EXECUTIVE SUMMARY


On  July  23,  the  Library  of  Congress  hosted  the  second  SIGWAIS  (Special  Interest 
Group  on  Wide  Area  Information  Servers).  Sponsored  jointly  by  the  U.S.  Geological 
Survey  and  the  Clearinghouse  for  Networked  Information  Discovery  and  Retrieval 
(CNIDR)  as  well  as  by  the  Library  of  Congress,  the  theme  of  this  meeting  was  "Libraries 
and  Internet  Databases:  Quality  and  Navigation."  Approximately  250  attendees  from 
the federal government, academic institutions, and the private sector convened to share
information  about  their  work  in  developing  and  making  use  of  WAIS  and  related 
network  tools. 

The  morning  session  focused  on  the  policy  and  communications  environments  that 
are  supporting  the  dissemination  of  public  information  using  the  Internet,  and  on  the 
emerging social aspects of the national (and international) networked communication
and  information  infrastructure. 

The  afternoon  session  offered  eight  brief  technical  discussions  highlighting  a 
variety of WAIS implementations in the public and private sectors. Included in these
sessions  was  a  panel  discussion  answering  "frequently  asked  questions"  about  the  Z39.50 
standard  protocol. 

Demonstrations  of  a  variety  of  WAIS  (and  related)  tools  were  offered  in  the 
Library's National Demonstration Lab. ITS joined NASA, USGS, Picture Elements,
WAIS, Inc., and other organizations with its demonstration of the
Library's newly created OS/2 WAIS client.

Library  staff  instrumental  in  making  the  conference  possible  were  a  planning  team 
that  included  Mary  Bernheisel  (ITS),  Audrey  Fischer  (ITS),  Chuck  Gialloreto  (ITS),  Dan 
Gold  (CDS),  Anna  Keller  (CPO),  Bob  Morgan  (ITS),  Virginia  Sorkin  (ITS),  and  Joe 
Wright  (ITS).  Other  Library  staff  whose  support  was  critical  included  Jane  Bortnick 
Griffith  and  ITS  Director  Herbert  Becker,  as  well  as  Tom  Littlejohn  (ITS),  Ray 
Denenberg  (Network  Development  Office),  John  Ragsdale  (CRS),  and  Jacqueline  Hess 
and Ellyn Blanton of the National Demonstration Lab.

SIGWAIS-II  documents,  including  many  speakers'  presentations,  are  being  made 
available  on  the  Library's  Campus  Wide  Information  System,  LC  MARVEL.  To  access 
them,  point  your  gopher  to:  marvel.loc.gov  and  login  as  marvel;  or  telnet  to 
marvel.loc.gov  70  and  login  as  marvel. 
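
For readers curious about what pointing a Gopher client at a server such as marvel.loc.gov involves at the protocol level, the following is a minimal sketch in Python. It assumes the standard Gopher exchange on port 70: the client sends a selector line, and the server answers with a tab-separated menu ended by a lone period. The function and type names are illustrative only and are not part of any Library software.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str   # one-character Gopher type, e.g. "0" for a file, "1" for a menu
    title: str
    selector: str
    host: str
    port: int


def build_request(selector: str = "") -> bytes:
    """A Gopher request is the selector followed by CRLF (empty = top menu)."""
    return selector.encode("ascii") + b"\r\n"


def parse_menu(text: str) -> List[MenuItem]:
    """Parse a Gopher menu: one item per line, fields separated by tabs."""
    items = []
    for line in text.splitlines():
        if line == ".":              # a lone period ends the listing
            break
        fields = line[1:].split("\t")
        if not line or len(fields) < 4:
            continue                 # skip blank or malformed lines
        title, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], title, selector, host, int(port)))
    return items


def fetch_menu(host: str, port: int = 70, selector: str = "") -> List[MenuItem]:
    """Connect, send the selector, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(selector))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_menu(b"".join(chunks).decode("latin-1"))
```

Under these assumptions, a call such as fetch_menu("marvel.loc.gov") would request the server's top-level menu; whether a given host still answers Gopher requests is of course not guaranteed.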
REMARKS TO SIGWAIS MEETING


Jane  Bortnick  Griffith,  Library  of  Congress 

Acting  Associate  Librarian  for  Science  and  Technology  Information 

It  is  my  pleasure  to  welcome  all  of  you  here  today  to  the  Library  of  Congress  to 
participate  in  the  second  SIGWAIS  meeting.  The  Library  is  pleased  to  be  part  of  the 
growing  Internet  community  and  is  committed  to  continuing  our  involvement  in  Internet 
activities.  It  is  appropriate  to  have  this  meeting  here  because  the  Library  of  Congress  is 
an institution with strong ties to librarians, technology developers, and policymakers, all
key  players  in  the  evolution  of  an  advanced  information  infrastructure.  We  believe  that 
dissemination  of  information  over  the  Internet  will  enable  us  to  reach  far  beyond  the 
existing  LC  user  community  to  a  vastly  expanded  array  of  users  throughout  the  United 
States  and  internationally.  At  the  same  time  we  recognize  that  the  Internet  offers  us 
new  opportunities  for  acquiring  information  more  efficiently,  collaborating  with  other 
information  providers,  and  improving  the  effectiveness  of  our  internal  operations. 

Let  me  just  mention  a  few  milestones  here  at  the  Library  to  illustrate  our 
increasing  involvement  in  use  of  the  Internet.  In  August  1990,  the  Library  established  its 
initial connection to the Internet at 56K. In January 1992 that connection was upgraded
to T-1 speed (1.54M). In March of last year selected documents from the "Revelations
from the Russian Archives" exhibit were scanned and the images made available as the
Library's  first  anonymous  FTP  directory.  Selections  from  all  succeeding  major  LC 
exhibits  also  have  been  made  available  via  FTP.  These  include  "1492:  An  Ongoing 
Voyage"; "Rome Reborn: The Vatican Library and Renaissance Culture"; and "Scrolls
from  the  Dead  Sea."  The  number  of  people  who  have  been  able  to  view  these  exhibits 
online  greatly  surpasses  those  able  to  actually  visit  the  exhibits.  The  Russian  material 
came  from  the  Communist  Party  archives  that  had  never  before  been  seen.  And  even 
today,  the  Russian  Government  does  not  make  those  archival  materials  available  to  the 
public. Therefore downloading them from the Internet is the only way Russians can
have  access  to  them. 

Over  a  year  ago  an  Internet  User's  group  was  formed  here  at  LC.  That  user's 
group has operated much in the spirit of Internet users everywhere: it is essentially a
grassroots  organization  comprised  of  staff  willing  to  devote  time  to  enhancing  their  own 
and  the  Library's  use  of  Internet.  That  group  continues  to  be  a  driving  force  in  LC's 
Internet  activities  and  deserves  much  credit  for  the  fine  work  they  have  done.  In  March 
of this year, LC established its first Listserv, "The Library of Congress Cataloging
Newsline," and in April the Library of Congress Information System (LOCIS), containing
card catalog and legislative information, became available. Just this month we unveiled
our Gopher-software Campus-Wide Information System, which we call MARVEL, to the
Internet  community.  The  response  has  been  overwhelmingly  positive. 

The  Library  also  has  a  number  of  efforts  underway  for  expanding  our  use  of 
Internet. A WAIS OS/2 client is expected to be completed this summer. Demos of it can
be seen today in the Atrium. An evaluation of the WAIS commercial server began this
month. It is being used to search the MARVEL databases containing the Library's
regulations,  a  microforms  records  database,  and  "Publications  in  Print."  Selections  from 
the  Library's  American  Memory  project  containing  collections  of  original  Americana 
materials  are  being  made  available  as  WAIS  databases.  These  include  Civil  War 
photographs,  broadsides  from  the  collection  "Documents  of  the  Continental  Congress 
and  Constitutional  Convention,  ca.  1774-1790";  and  documents  from  the  state  of  Indiana 
collection  "Life  Histories  from  Folklore  Project,  Works  Progress  Administration,  Federal 
Writers' Project, 1936-1939." The Library also is evaluating use of the WAIS server for
a  number  of  other  textual  materials. 

Another  Internet  pilot  project  underway  is  the  Law  Library's  International  Legal 
Information  Network  which  is  transferring  images  of  foreign  statutes  from  Mexico  and 
Brazil.  An  abstract  is  then  created  from  these  images  and  made  available  on  the 
Library's  SCORPIO  online  retrieval  system.  In  addition,  the  Library  is  participating  in 
an  Advanced  Research  Projects  Agency  (ARPA)  project  being  carried  out  by  the 
Corporation  for  National  Research  Initiatives  (CNRI)  to  prototype  an  electronic 
copyright  management  system  that  will  initially  store  and  disseminate  technical  reports  in 
computer  science  generated  by  five  universities.  The  project  includes  handling  electronic 
submission  of  documents  (including  digital  signatures  for  verification),  storage  in  an 
online  depository,  the  digital  transfer  of  rights  and  permissions,  electronic  payment,  and 
user  interfaces.  As  you  can  see  from  this  listing  of  projects,  the  Library  is  very  much 
involved  in  a  variety  of  efforts  to  expand  access  to  information  electronically.  We 
recognize  that  this  is  just  a  beginning.  Many  of  you  in  this  room  are  the  pioneers  in 
expanding  use  of  the  Internet.  We  look  forward  to  benefitting  from  continuing 
interaction  with  others  in  the  Internet  community  and  sharing  ideas  of  mutual  interest. 

This  is  a  very  exciting  time  for  those  of  us  in  the  "information"  field.  The  rate  of 
progress  for  developing  new  tools  for  accessing  information  electronically  is  astounding. 
The  amount  of  information  being  made  available  on  the  networks  is  so  enormous  that  it 
creates  real  challenges  for  us.  The  number  of  users  continues  to  expand  at  an 
exponential  rate.  And  information  issues  are  receiving  attention  in  Congress,  the 
Administration, and even the press as never before. As a policy analyst in this field for
many  years,  I  used  to  wonder  whether  these  issues  would  ever  attract  much  attention. 
Today,  articles  on  the  information  superhighway  appear  in  the  daily  papers  and  in 
popular  magazines. 

Congress  is  considering  a  number  of  bills  related  to  promoting  electronic  access  to 
information.  These  include  bills  to  promote  development  of  an  information 
infrastructure;  bills  to  enhance  access  to  government  information;  bills  promoting 
applications for schools, libraries, and health services; and bills focusing on privacy and
intellectual property rights. It would take too long to go through all of these this
morning, but I believe we have made a listing of them available to you. What seems
clear, however, is that a number of these bills are receiving priority attention and that
they will contribute to the policy framework necessary for advancing a national
information  infrastructure.  For  example,  P.L.  103-40,  the  Government  Printing  Office 
Electronic  Information  Access  Enhancement  Act  of  1993  promotes  electronic  public 
access  to  government  information,  including  the  Congressional  Record  and  the  Federal 
Register.  H.R.  1757,  the  High  Performance  Computing  and  High  Speed  Networking 
Applications Act of 1993 contains a section on applications for government information
that  authorizes  projects  for  "connecting  depository  libraries  and  other  sources  of 
government  information  to  the  Internet." 

These  are  just  two  of  several  bills  related  to  information  infrastructure  issues. 
Just  last  week  the  Library  hosted  a  one  day  conference  entitled  "Delivering  Electronic 
Information  in  a  Knowledge-Based  Democracy"  which  brought  together  about  40 
participants from industry, government, and the library community to discuss the policy
framework essential to creating an advanced information infrastructure. The three main
themes of the conference were: Building, locating, and preserving the electronic store of
knowledge; public and private sector roles; and mechanisms for safeguarding intellectual
property rights. Vice President Al Gore, who served as Honorary Chair, stated at the
meeting  that  the  construction  of  the  information  superhighway  will  facilitate  a  new 
distillation  process,  where  raw  data  is  pressed  into  information,  the  information  is 
distilled  into  knowledge,  and  then  knowledge  is  fermented  into  wisdom.  The  people  in 
this  room  are  certainly  important  players  in  that  distillation  process. 

The  Administration  likewise  is  devoting  considerable  attention  to  building  a 
national information infrastructure (NII). The White House has recently formed, under
the aegis of the National Economic Council and the Office of Science and Technology
Policy, the Information Infrastructure Task Force (IITF), which will be chaired by the
Secretary of Commerce. Much of the work of the task force will be done through three
committees: Telecommunications Policy, Information Policy, and Applications. In
addition  a  private  sector  advisory  group  will  be  established.  Through  this  task  force,  a 
variety  of  issues  relating  to  promoting  a  national  information  infrastructure  will  be 
addressed  and  strategies  developed. 

I  have  provided  an  overview  of  how  we  here  at  the  Library  are  working  toward 
enhancing our use of the Internet, as well as how Congress and the Administration are
addressing  related  public  policy  issues.  It  is  my  pleasure  now  to  introduce  this  morning's 
other  speakers  who  will  present  perspectives  from  their  organizations  on  libraries  and 
Internet.  I  also  would  like  to  express  our  great  appreciation  to  Virginia  Sorkin  and 
Anna Keller for all their work in putting today's meeting together.
REMARKS ON NEW ADMINISTRATIVE INITIATIVES
IN  INFORMATION  POLICY 

Peter  Weiss 

Office of Information and Regulatory Affairs
Office  of  Management  and  Budget 

I  would  like  to  reiterate  a  few  things  that  Jane  said.  This  administration, 
particularly  my  boss,  Sally  Katzen,  the  new  Administrator  of  the  Office  of  Information 
and Regulatory Affairs, the Director of OMB, Mr. Leon Panetta, and Vice-President
Gore, have all said to us, on their staffs, and have said publicly on a number of
occasions,  that  they  are  fully  committed  to  the  development  of  governmental  policy  that 
will  aid  us  in  the  transition  to  an  electronic  environment  both  in  the  way  we  do  our 
day-to-day  business  and  in  the  way  we  make  government  information  available  to  the 
public.

There  are  a  couple  of  parameters  to  this.  Jane  mentioned  one  of  them;  that  is 
the  Information  Infrastructure  Task  Force  being  jointly  managed  and  set  up  under  the 
auspices  of  the  National  Economic  Council  and  the  Office  of  Science  and  Technology 
Policy  with  heavy  involvement  from  the  Secretary  of  Commerce,  from  my  boss  Sally 
Katzen,  and  from  a  number  of  agencies. 

We  hope  that  process  will  address  some  of  the  issues  that  have  been  stalled  or 
have  not  been  given  adequate  attention  over  the  last  few  years.  To  give  you  one 
example:  the  group  has  decided  to  look  at  issues  associated  with  the  Freedom  of 
Information  Act  in  the  electronic  environment.  I  do  want  to  reiterate  what  Jane  said 
about there being a mechanism set up which will permit public input into that process so
that  you  folks  will  have  an  opportunity  to  share  your  views  with  that  process.  It  will  not 
be  a  closed  government  figuring  out  what-is-best-for-you  type  of  process.  I  can't  stress 
that  more.  We  are  committed  to  that. 

What I'd like to do now for a couple of minutes is highlight a few topics that are
contained  in  another  facet  of  the  administration's  approach  to  the  emerging  technologies. 
And  that  is  OMB's  Circular  A-130,  the  first  comprehensive  revision  of  which  was 
published  in  the  Federal  Register  on  July  2.  There  are  a  limited  number  of  copies 
available  today.  They  can  also  be  downloaded  through  anonymous  FTP  from  the 
Internet.  This  is  one  of  the  first  times  that  a  semi-formal  government  document  has 
been  made  broadly  available  on  the  Internet.  Let  me  just  quickly  give  you  the 
instructions. Use anonymous FTP to nis.nsf.net; the file is entitled ombomb.a130.rev2.
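
For readers who would rather script the retrieval than type the FTP commands by hand, here is a minimal sketch using Python's standard ftplib module. It assumes the server accepts anonymous logins and that the file is plain text; the helper names fetch_text_file and fetch_from_host are illustrative and not part of any OMB or NSF tooling.

```python
from ftplib import FTP
from typing import List


def fetch_text_file(ftp, filename: str) -> List[str]:
    """Log in anonymously and return the lines of a text file."""
    ftp.login()                       # with no arguments, ftplib logs in as "anonymous"
    lines: List[str] = []
    ftp.retrlines("RETR " + filename, lines.append)
    ftp.quit()
    return lines


def fetch_from_host(host: str, filename: str) -> List[str]:
    """Open a connection to `host` and fetch `filename` anonymously."""
    return fetch_text_file(FTP(host, timeout=30), filename)
```

A call such as fetch_from_host("nis.nsf.net", name), with name set to the file name given above, would return the document as a list of lines, provided the host still offers it. Splitting the work into two functions keeps the login-and-retrieve steps testable without a network connection.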

This document is an umbrella of federal information management policy and it is
directed to the heads of the various federal agencies. As such, it deals with a number of
topics: the collection of information; records management; information management; and
information dissemination. One of the thrusts of the document, however, is a strong
emphasis  on  electronic  information  dissemination.  Let  me  just  read  to  you  one  piece 
from  the  explanatory  materials  in  the  document  where  we  talk  about  the  benefits  that 
can  be  derived  from  electronic  information  dissemination  and  then  go  on  to  say  that, 
"the  development  of  public  electronic  information  networks  such  as  the  Internet  provides 
an  additional  way  for  agencies  to  increase  the  diversity  of  information  sources  available 
to  the  public.  Emerging  standards  such  as  Wide  Area  Information  Servers  using  NISO 
Z39.50  will  be  used  increasingly  to  facilitate  dissemination  of  government  information 
such  as  environmental  data,  international  trade  information,  and  economic  statistics  in  a 
networked environment."

The  preceding  is  a  statement  of  vision.  And  what  we  attempt  to  do  in  the  body  of 
the  policy  document,  among  many  other  things  in  this  broad-ranging  document,  is  to 
state  a  basic  policy  on  electronic  information  dissemination.  Let  me  just  read  it  to  you: 

"Agencies  shall  use  electronic  media  and  formats  including  public  networks  as 
appropriate  and  within  budgetary  constraints  in  order  to  make  government  information 
more  easily  accessible  and  useful  to  the  public.  The  use  of  electronic  media  and  formats 
for  information  dissemination  is  appropriate  under  the  following  conditions:"  [And  I 
would  note  that  these  conditions  are  not  exclusive,  they  are  illustrative.] 

"A)  The  agency  develops  and  maintains  the  information  electronically.  B) 
Electronic  media  and  formats  are  practical  and  cost-effective  ways  to  provide  access  to  a 
large  highly  detailed  volume  of  information.  C)  The  agency  disseminates  the  product 
frequently.  D)  The  agency  knows  a  substantial  portion  of  users  have  ready  access  to  the 
necessary  information  technology  and  training  to  use  electronic  information 
dissemination  of  products  and  a  change  to  electronic  dissemination  as  the  sole  means  of 
disseminating  the  product  will  not  impose  substantial  acquisition  or  training  costs  on 
users,  especially  state  and  local  governments  and  small  business  entities." 

Those last two points, making sure that the audience is capable of receiving the
information electronically and making sure that people who do not yet have access to
the technologies and the training are not disadvantaged, we feel are important,
particularly since, as you all know, we are in a transitional period and not
everyone yet is as wired as you are.

So  that,  briefly,  is  what  we've  done  on  electronic  information  dissemination  in 
A-130. There is also a strong emphasis on the use of electronic techniques, for example,
electronic  data  interchange  for  the  collection  of  information  from  the  public.  There  are 
a  number  of  pilot  projects  going  in  the  government  in  the  area  of  public  purchasing, 
procurement,  and  in  the  area  of  regulatory  matters.  For  example  the  Securities  and 
Exchange  Commission  is  ramping  up  its  electronic  data  gathering  and  retrieval  system. 
In a couple of years all publicly traded companies will no longer be filing paper reports.
The Environmental Protection Agency is experimenting with a couple of initiatives to
automate some of the environmental filings. You can imagine that the opportunities go
on  and  on.  We  feel  very  strongly  that  electronic  information  collection  and  electronic 
information  dissemination  are  two  sides  of  the  same  coin  and  that  the  use  of  these 
techniques  has  a  strong  likelihood,  if  used  properly,  to  speed  up,  and  make  more 
efficient  the  day-to-day  workings  of  the  government  and,  hopefully,  in  the  process  save 
the  taxpayers  money.  So  we  are  committed  to  this. 

In  a  separate  initiative,  my  boss,  Sally  Katzen,  is  developing  guidance  to  the 
agencies  which  will  explicitly  encourage  them  to  consider  electronic  techniques  in  their 
information  collection  activities  and  we  will  be  using  our  authority  under  the  Paperwork 
Reduction  Act  to  try  to  see  that  that  occurs  more  frequently  than  it  does  today. 
INTERNET INFORMATION ACCESS AND DELIVERY:
KEY  CONCEPTS,  TOOLS,  STRATEGIES,  AND  ISSUES 

Paul  Evan  Peters 

Executive  Director 

Coalition  for  Networked  Information 

Washington,  DC 

Introduction 

Craig  [Summerhill]  and  I  want  to  share  some  of  what  the  Coalition  has  learned 
and  is  doing  in  this  very  important  area,  which  is  central  to  the  mission  of  the  Coalition. 

I  will  focus  on  the  current  state  of  the  Internet  information  environment,  and  will 
suggest  a  few  strategies  by  which  you  can  make  progress  in  this  area. 

Craig  will  focus  on  the  standards  and  standards  development  efforts  that  are  most 
relevant  to  this  area,  and  will  suggest  a  few  ways  by  which  you  can  get  involved  with 
those  standards  and  in  those  efforts. 

We  hope  to  provide  a  framework  for  today's  proceedings  and  perhaps  even  for 
some  thinking  and  work  you  will  be  doing  after  today. 

Our  basic  message  to  you  today  is: 

*  Content,  quality,  and  navigation  are  the  value  generating  opportunities  and 
challenges  of  the  contemporary  Internet.  Being  concerned  about  these  things  is 
definitely  one  of  the  right  things  about  which  to  be  concerned. 

*  Information  resource  managers  (by  which  I  mean  librarians,  information 
technologists, user support specialists, publishers, etc.) will play the key role in framing
and  addressing  Internet  content,  quality,  and  navigation  opportunities  and  challenges. 

*  Early  and  continuing  concern  about  "interoperability"  and  participation  in 
standardization  efforts  that  promote  interoperability  of  platforms,  systems,  and  services 
are  extremely  important  factors  for  making  sustainable  progress  in  this  area. 

Session  aerobics 

Reality  check!  How  many  of  you  have: 
Heard  of  the  Internet. 
Use  the  Internet. 

Use  the  Internet  for  something  other  than  electronic  mail. 

Do any of you think the Internet was invented by the commercial fishing industry?!
Professional identification. How many of you think of yourself as:

Librarians.
Technologists. 
Managers. 
UNIX  gearheads?! 

"Jurassic  Park":  "It's  UNIX;  I  can  handle  this;  Cool." 


Our  basic  perspective 

We  represent  a  coalition  of  three  large,  primarily  North  American,  associations 
that address various aspects of knowledge management and information technology in
primarily  higher  education  settings:  ARL,  CAUSE,  and  EDUCOM. 

We  manage  a  task  force  of  190+  institutions  and  organizations  that  provide  this 
coalition with many of the insights, initiatives, and resources it needs to pursue its
mission  of  promoting  the  creation  and  use  of  networked  information  resources  and 
services  to  promote  scholarship  and  intellectual  productivity.  One  third  of  the  members 
of  the  task  force  are  technology,  information,  and  other  providers,  and  many  of  these  are 
for-profit  entities. 

The members of the Coalition and its task force have made major, perhaps the
major, investments of time, talent, and money in the Internet, and they are eager to
increase  the  returns  on  their  investments  by  promoting  the  use  of  the  Internet  for 
communication and publication as well as for computation.

The  Coalition  is  also  a  small  business  (three  folks  and  around  $700K  per  year) 
that  offers  a  variety  of  networked  information  resources  and  services  to  the  Internet 
community.  We  are  ".org"  and  proud  of  it. 

The  current  Internet  Information  Environment 

I  regard  the  Internet  to  be  the  "networked  information  universe"  that  was  formed 
by  the  big  bang  in  cyberspace  that  occurred  in  1986,  when  the  NSFNet  began  production 
operation. 

Three  basic  conditions  produced  the  big  bang: 

*  NSF  supports  basic  rather  than  mission  oriented  research  and  education. 

*  NSF  connects  institutions  rather  than  specific  principal  investigators. 

*  NSF  acts  as  a  "transit"  network,  particularly  for  international  traffic. 
A critical mass of performance and users has formed in the Internet, and this
critical  mass  is  generating  networked  information  resources  and  services  in  a 
spontaneous  and  mutually  reinforcing  manner. 

A  large  portion  of  the  opportunities  and  challenges  presented  by  the 
contemporary  Internet  information  environment  can  be  described  as  embracing  the 
realities  that  it  presents. 

Said  otherwise,  a  paleo-electronic  environment  has  formed  in  the  Internet.  It  is 
an  environment: 

*  In which crude tools are being used to fashion crude but functional artifacts;

*  In which the dominant personalities are hunters, gatherers, and story-tellers; and

*  In  which  institutions  and  organizations,  including  libraries  and  information 
centers  and  providers  of  all  types,  are  hard  at  work  securing  the  gains  of  these  pioneers 
by  constructing  fixed  settlements  that  are  attractive  to  settlers  who  are  much  more 
interested  in  husbanding  domesticated  flora  and  fauna  than  they  are  in  exploring  what's 
over  the  next  technological  horizon. 

-  "Flora"  =  databases.  They  grow,  requiring  weeding  and  pruning,  and  do  not 
move  from  one  place  to  another  on  their  own  accord. 

-  "Fauna"  =  algorithms.  They  spawn,  infect,  and  have  minds  of  their  own. 

Yes, my perspective on the current state of the network info-structure
represented  by  the  Internet  has  been  influenced  by  my  love  of  science  fiction:  John 
Brunner's  "The  Shockwave  Rider,"  David  Brin's  "Earth,"  Vernor  Vinge's  "A  Fire  Upon 
the  Deep,"  Milton  Wolf  and  R.  Bruce  Miller's  "Intelligent  Robots,  An  Aware  Internet, 
and  Cyberpunk  Librarians." 

But  this  is  not  an  entirely  whimsical  interest  of  mine,  because  language  is 
important  and  metaphors  are  invaluable  wheels  for  the  mind. 

This  is  particularly  interesting  to  me  because  one  metaphor,  that  of  the 
"information  superhighway,"  has  seized  most  if  not  all  of  the  early  engineering  and  public 
policy conceptual space. As is the case with all metaphors, this has had an effect,
however tacitly, of including some people and things to the exclusion of other people and
things. I have been pursuing other metaphors because of my concern about this process
of  conceptual  and  social  inclusion  and  exclusion,  particularly  at  these  early  stages  in  the 
engineering  and  population  of  cyberspace. 
*  Key features of the "information superhighway" metaphor are that: The
Internet is being built to meet specified requirements; capitalists, engineers, and
regulators  are  playing  the  key  roles;  Apollo  is  the  god  of  the  Internet. 

*  Key  features  of  the  "information  universe"  metaphor  are  that:  The  Internet  is 
growing  in  response  to  aspiration  and  risk  taking;  pioneers,  inventors,  and  evangelists  are 
playing  the  key  roles;  Eros  is  the  god  of  the  Internet. 

*  Other metaphors for the Internet: a quilt; a flea market.

Strategies by Which to Make Progress

First,  advocate  a  total  Internet  strategy  to  your  management.   This  means  getting 
the  local  info-structure  in  place;  training  and  supporting  end-users  and  staff;  using  the 
Internet  for  document  access  and  delivery;  and  developing  capabilities  for  Internet 
resource  discovery  and  management. 

Second,  pay  close  attention  to  "creative"  behaviors  of  network  users. 
Breakthroughs  will  come  from  folks  who  are  working  on  priorities  unknown  to  the 
managers  who  keep  them  in  computer  cycles  and  network  bandwidth.  Breakthroughs 
will  come  from  folks  who  suffer  from  the  "got  a  hammer,  then  everything  is  a  nail" 
syndrome.  Breakthroughs  will  also  come  from  folks  who  are  too  desperate  or  too  dumb 
to  behave  according  to  the  "received  wisdom."  And,  of  course,  breakthroughs  will  come 
from  professional  information  resource  managers  like  us. 

Third,  advocate  interoperable  platforms,  systems,  and  services.  "It's  a  network, 
stupid!" "All the world's a network, and all the nodes are clients and servers." Think in
terms  of  each  client  accessing  multiple  servers  simultaneously. 

Fourth,  be  open  to  fundamental  shifts  in  thinking:  From  acquiring  information  to 
constructing  tailored  information  system  images;  from  cataloging  to  registration;  from 
question-answering  to  current  awareness;  from  users  looking  for  information  to 
information  looking  for  users;  from  users  looking  for  information  to  authors  looking  for 
audiences. 

Finally, practice, practice, practice. Give in to relatively more technology push
than demand pull. And, when all is said and done, make sure that you do more than
you  say! 
CNI'S WORK AND HOW TO GET INVOLVED


Craig  Summerhill 

Systems  Coordinator  and  Program  Officer 
Coalition  for  Networked  Information 


CNI is host to several listservs related to CNI (listserv@cni.org). CNI FTP
archives can be found at ftp.cni.org. CNI's Gopher is available at gopher.cni.org using
the standard port 70. There are also a variety of databases recently created which serve
a need for institutional list history. To access these, telnet to a.cni.org and login as
brsuser using basic vt100 emulation.

I  don't  need  to  give  this  audience  a  full  history  of  the  Internet,  but  as  a  precursor 
to the comments I want to make, it's interesting to note that the Internet developed in
the late 60's as a project of the Advanced Research Projects Agency using a protocol
suite  called  TCP/IP.  Simple  Mail  Transfer  Protocol  (SMTP),  File  Transfer  Protocol 
(FTP)  and  telnet,  the  original  three  applications  developed,  until  a  couple  of  years  ago, 
accounted  for  the  majority  of  Internet  traffic.  Somebody  recently  challenged  my 
statement  that  SMTP  still  accounts  for  the  vast  majority  of  traffic,  contending  instead 
that  Gopher  had  recently  overtaken  SMTP.  I  have  not  been  able  to  find  data  to  validate
that  but  think  that  it  is  possible. 

In  the  last  couple  of  years,  more  advanced  tools  have  been  developed.  The  tools 
are  discussed  in  alphabetical  order  with  the  exception  of  Veronica,  which  I  consider  a 
subset  of  Gopher. 

*  Archie  was  developed  at  McGill  University.  Archie  is  an  automated  system  for 
accessing  and  building  databases  of  anonymous  FTP  archive  sites  across  the  Internet. 

*  Gopher  was  developed  to  provide  a  menuing  system  for  viewing  Internet 
information  servers  hierarchically.  The  librarians  in  the  audience  can  draw  the  metaphor 
between  browsing  the  book  stacks  and  browsing  through  menus  in  Gopher  space.
Veronica  added  an  Archie-like  capability  to  Gopher  by  allowing  the  user  to  do  some
basic  keyword  searching  and  later  some  very  basic  Boolean  operand  searching  across  the
items  which  live  in  Gopher  space,  thereby  providing  some  extensibility  to  Gopher  for
retrieving  Internet  information. 

*  NCSA  Mosaic  is  a  single  user  interface  which  provides  access  to  a  wide  variety
of  Internet  tools.  There  are  buttons  or  click-and-drag  menus  that  allow  a  user  to  execute
WAIS  searches,  access  Gopher  items,  use  FTP  and  finger,  all  of  these  protocols  from  a
single  application  which  does  not  require  32M  of  memory  to  run.

*  There  is  the  Wide  Area  Information  Server  which  you  all  know  something
about  or  you  would  not  be  here.  The  WAIS  system  uses  the  NISO  Z39.50  protocol. 

*  And  finally  there  is  the  Worldwide  Web  which  was  developed  at  CERN. 

These  are  all  more  advanced  client-server  applications.  One  of  the  things  which 
we  are  beginning  to  see  with  these  systems  is  that  they  perform  certain  functions  well 
and  don't  perform  other  functions  as  well. 

What  is  on  the  horizon?  Multipurpose  Internet  Mail  Extensions  (MIME)  which
will  allow  multimedia  to  be  passed  in  mail  on  the  Internet.  It  will  be  a  long  time  before 
this  is  available  at  every  home  or  to  the  world,  but  certainly  within  the  research  and 
development  communities  and  within  most  of  the  organizations  which  you  are  affiliated 
with,  there  are  workstations  which  have  the  capability  to  support  this  type  of  protocol. 
We  already  have  the  ability  to  pass  graphics  and  video  within  electronic  mail 
transmissions,  and  we  have  databases  which  can  serve  up  images  in  a  networked 
environment. 

One  of  the  two  things  on  the  horizon  which  I  want  to  focus  on  is  the  development 
of  uniform  methodologies  for  resource  location  and  identification.  The  Internet 
Engineering  Task  Force  (IETF)  has  working  groups,  under  the  broad  umbrella  of 
Integrated  Information  Architectures,  which  are  very  important  to  the  development  of 
this  methodology.  We  are  going  to  see  far  more  peer-to-peer  communications, 
computers  talking  to  computers,  not  people  necessarily  talking  to  computers.  We  are 
going  to  see  clients  which  incorporate  user  profiles  and  filtering  of  information.  We  are 
very  soon  going  to  be  swimming  in  an  information  sea.  Those  of  you  who  hold  focal 
positions  within  your  organization,  such  as  postmaster,  know  how  overwhelmed  you  are
with  mail.  We  need  to  have  better  tools  for  filtering  information.  We  are  going  to  see
telemetry  and  sensor  feeds.  We  already  have  a  satellite  or  satellites  in  orbit  which
transmit  North  American  weather  information  to  the  'Net  where  a  computer 
automatically  receives  the  data  and  loads  the  files  in  various  formats  into  an  FTP  archive 
where  the  files  become  accessible.  There  are  two  Internet  workstations  in  Antarctica. 
In  the  future,  you  are  going  to  see  weather  stations  online  which  continuously  broadcast 
weather  information  on  a  specific  frequency  so  that  all  a  user  has  to  do  is  focus  a  client
on  that  frequency  to  receive  that  data.  We  will  see  data  that  does  not  require  human
intervention  for  creation.  In  the  future  we  will  see  more  multi-faceted  clients  such  as
Mosaic. 

Issues  which  I  consider  important  are  the  following.  The  registration  model 
becomes  more  prevalent  in  this  environment.  The  maintenance  agency  for  protocols  and 
for  systems  becomes  increasingly  important.  The  Library  of  Congress  is  the  maintenance
agency  for  the  NISO  Z39.50  protocol  in  the  United  States.  The  maintenance  agencies
become  increasingly  important  as  we  begin  to  develop  interoperable  systems,  systems
which  we  can  rely  on  for  a  long  term  period.  There  will  be  competing  commercial  and
non-commercial  interests  which  come  up.  One  of  my  concerns  is  that  there  is  a  big
difference,  known  to  the  library  community,  between  known-item  searching  and 
topic-oriented  queries.  Known-item  searching  is  very  easy  to  do  given  some  of  the  tools 
available.  That's  great  if  the  user  knows  something  exists.  But  more  and  more 
information  is  going  to  seek  an  audience,  authors  are  going  to  look  for  someone  to 
distribute  their  goods  to.  People  have  to  have  tools  for  doing  known-item  queries  in  a 
large  meta-space  at  different  layers  of  the  information  universe.  Quality  versus  quantity
is  a  big  issue. 

Granularity  in  directory  service  development  is  very  important.  This  is  a  critical 
issue.  One  of  the  problems  we  have  with  projects  such  as  the  Coalition's  TopNode 
project  and  other  similar  directory  services  projects  is  that  the  people  who  are  building 
the  directories  don't  have  a  very  good  vision  of  what  they  are  trying  to  build.  There  is  so 
little  information  in  the  network  universe  compared  to  the  analog  universe  of  print  which
libraries  deal  with  now.  It  is  an  infinitesimal  amount  of  information  with  a  lot  of 
duplication.  People  who  have  been  building  directory  services  have  not  had  a  good 
vision  of  what  their  directory  is  supposed  to  do  or  what  level  of  granularity  the  directory 
needs  to  serve.  There  are  a  wide  variety  of  directories.  Look  at  AT&T  or  Bellcore  type 
white  pages:  very  simple  structure,  sometimes  an  address  and  telephone  number,
sometimes  a  name,  address  and  telephone  number.  There  are  much  more  detailed
directory  types  of  information  which  can  be  built.  Knowing  the  cut  (whether  looking  at
meta-  or  micro-data  within  a  specific  database)  when  building  the  tools  is  becoming
increasingly  important  as  the  network's  information  space  grows. 

Supporting  remote  users  is  going  to  be  increasingly  difficult.  Similarly,  how  to 
determine  what  kind  of  meaningful  data  they  are  getting  out  of  their  system  in  a 
wide-area  environment  will  continue  to  pose  a  problem.  When  someone  gets  on  a  video 
link  from  New  York  City  to  L.A.  and  is  searching  one  of  your  databases  and  doesn't 
understand,  you  will  have  to  determine  what  kind  of  client  is  being  used,  what  version  of 
the  software  is  being  used.  These  variables  will  cause  some  interesting  demands  in  terms 
of  technician  and  librarian  user  support  of  our  systems. 

I  am  now  going  to  spend  my  final  two  minutes  on  the  two  items  which  Paul  asked 
me  to  cover.  They  are:  Z39.50  as  an  open  standard  and  the  development  of  that 
standard  in  a  large  user  community  specifically  based  in  the  TCP/IP  environment;  and 
the  Internet  Engineering  Task  Force  (IETF)  and  some  of  the  work  which  is  being
done  by  it. 

I  am  going  to  defer  some  portion  of  the  Z39.50  discussion  in  anticipation  of  the 
Z39.50  panel  to  be  chaired  by  Ray  Denenberg  this  afternoon.  There  are  two  Z39.50 
groups  which  I  think  you  should  be  aware  of.  The  first  is  the  Z39.50  Implementors  Group  (ZIG),
which  is  focused  on  the  continued  development  of  the  standard  and  on  the  development
of  interoperable  systems  using  the  standard.  The  initial  WAIS  system  which  Brewster
designed  and  put  into  the  public  domain  was  based  on  the  1988  revision  of  the  Z39.50
protocol.  I  think  Brewster  would  agree  with  me  that  most  systems  people  would  have
assessed  that  protocol  as  being  highly  unusable  for  developing  a  well-refined  system.  So 
WAIS  incorporated  certain  extensions  beyond  the  bounds  of  the  protocol  to  make  the 
system  work.  In  1992  version  2  of  the  standard  was  officially  voted  on  and  adopted  and 
published  by  NISO.  There  is  a  current  draft  3  which  Ray  will  talk  more  about  since  the
draft  version  lives  on  his  personal  computer.  It  incorporates  some  important  features
that  are  being  developed  by  ZIG  members  into  their  systems.  ZIG  developers  include
both  not-for-profit  and  commercial  ventures,  such  as  AT&T,  OCLC,  and  libraries  such
as  Penn  State,  and  the  University  of  California,  working  to  develop  interoperable 
systems. 

Z39.50  has  been  criticized  as  being  very  complex  and  difficult  to  implement  and  I 
am  not  going  to  defend  it  as  I  think  that  there  is  some  validity  in  the  criticism.  But  there
are  people  in  this  room  who  I  think  can  tell  you  what  the  implications  of  developing  the
system  are,  going  outside  of  this  community.  Certainly  if  you  are  within  a  library  agency 
it  is  very  important  to  pay  attention  to  what  is  happening  with  the  ZIG  because  the 
library  system  vendors  and  the  library  community  in  the  larger  TCP/IP  Internet  are 
developing  systems  now  which  work  within  the  standard. 

The  other  group  related  to  this  is  a  group  that  the  Coalition  started  called  the 
Z39.50  Interoperability  Testbed  (ZIT)  which  was  a  spinoff  of  the  ZIG.  Clifford  Lynch, 
who  is  the  Director  of  Library  Automation  for  the  University  of  California,  chaired  this 
group.  He  is  interested  in  winding  this  group  down  this  year,  feeling  that  it  has  done 
much  of  what  it  was  set  up  to  do.  It  has  tested  some  of  the  interoperability  between 
client  and  server  developers.  It  has  also  added  some  forward  momentum  to  the 
development  of  the  ZIG  group  in  developing  the  standard. 

I  am  more  closely  associated  with  the  work  of  the  IETF.  The  diagram  I  am  going 
to  use  is  courtesy  of  Chris  Weider  of  Merit.  One  of  the  things  which  has  come  out  of 
the  Archie,  Gopher,  and  other  systems  experience,  is  the  need  for  a  uniform  method  of 
resource  location  and  resource  identification  in  the  network.  The  Archie  experience 
specifically  allowed  the  system  to  go  out  and  use  a  protocol  called  FTP,  tap  into
archives,  and  build  databases  of  the  directories  of  the  files.  Some  things  were  discovered 
early:  people  move  files  from  one  system  to  another;  sometimes  files  are  renamed  in  the 
process  so  that  the  names  are  more  descriptive  (in  systems  which  allow  more  descriptive
names).  If  an  FTP  archive  running  on  VMS  or  an  IBM  mainframe  is  moved  to  a  UNIX 
system  it  makes  sense  to  change  the  name  and  make  it  more  descriptive  because  UNIX 
is  not  limited  to  eight-character  file  names  and  file  types  like  VM/CMS  is.  If  a  file  is 
taken  from  a  UNIX  system  and  put  on  a  VM/CMS  system,  the  file  must  be  renamed  to 
match  the  IBM  file  naming  conventions.  There  also  is  no  way  of  differentiating  between 
files  which  contain  the  same  intellectual  content  but  differ  in  data  format.  This  problem 
exists  not  only  for  Archie,  but  for  other  network  tools  including  Gopher  and  WAIS.

The  IETF  developed  an  integrated  information  architecture  program  to  start
solving  some  of  the  problems  associated  with  tracking  objects  in  the  network.  Resources 
live  in  the  network.  Above  the  resource  is  a  transponder  or  system  which  uses  a 
protocol  that  knows  the  resource  location  and  the  uniform  resource  number  (URN)  or 
uniform  resource  identifier  of  this  object.  For  librarians  and  publishers,  the  identifiers 
are  similar  to  ISBNs  or  ISSNs.  The  transponder  takes  the  data  from  the  servers  and 
puts  it  into  a  database  similar  to  the  database  built  by  Archie.  The  database  consists  of 
URN  to  URL  (uniform  resource  locator)  mappings.  There  is  one  single  URN;  there  can 
be  multiple  locations  where  that  object  resides  in  the  network.  At  the  top  level  there  are 
the  TCP/IP  protocol  suite  and  the  application  protocols  such  as  Gopher,  WAIS,  WWW. 

In  this  example,  Gopher  has  a  URL.  It  knows  the  location  of  an  object  so  the
Gopher  system  can  bypass  the  resource  location  service  or  the  RLS  and  go  directly  to 
the  URL  directory  or  to  the  domain  name  service  and  then  contact  the  server  which  has 
the  object.  In  the  example,  WAIS  and  WWW  do  not  have  the  locator;  they  only  have
the  number  of  the  item.  The  WAIS  system  or  the  Worldwide  Web  system  will  contact  a 
resource  location  service  similar  to  the  domain  name  service.  The  Resource  Location 
Service  can  be  run  locally,  it  can  be  run  by  a  third  party,  it  can  be  run  by  a  mid-level
service  provider.  The  application  contacts  the  RLS  or  the  resolver  which  then  goes  out 
and  contacts  the  URL/URN  database,  does  a  look-up  on  the  URN  of  the  desired  object, 
finds  the  URL  or  multiple  URLs,  sends  the  data  back  to  the  resolver  which 
communicates  it  to  the  application.  The  application  can  then  go  and  retrieve  the  object 
directly  from  a  server  or  multiple  servers  in  the  network.  The  chief  advantage  of  this 
system  is  that  it  is  highly  automated.  If  an  object  is  moved,  the  transponder  will  send 
data  to  the  central  directory  servers  so  that  it  doesn't  matter  if  objects  move  or  are 
renamed.  The  URL  will  be  changed  as  needed;  URNs  never  change.  This  entire  model 
is  approximately  50  percent  complete.  There  are  four  IETF  working  groups  working  on 
parts  of  the  model.  The  URL  model  is  close  to  completion;  the  URN  model  is  still 
under  development. 
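
The lookup flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the IETF design: the class names (UrnDirectory, Resolver), the "urn:" prefix test, and every identifier and address in the usage lines are hypothetical, since the URN model was still under development when this talk was given.

```python
class UrnDirectory:
    """Hypothetical URN-to-URL database, kept current by the transponders."""

    def __init__(self):
        self._locations = {}  # URN -> set of URLs where the object resides

    def register(self, urn, url):
        """Record one more location for an object."""
        self._locations.setdefault(urn, set()).add(url)

    def move(self, urn, old_url, new_url):
        """Update a location when an object is moved or renamed."""
        urls = self._locations.setdefault(urn, set())
        urls.discard(old_url)
        urls.add(new_url)

    def lookup(self, urn):
        """Return every known URL for the URN (empty list if unknown)."""
        return sorted(self._locations.get(urn, ()))


class Resolver:
    """Stand-in for the resource location service (RLS)."""

    def __init__(self, directory):
        self._directory = directory

    def resolve(self, identifier):
        # An application that already holds a URL can bypass the lookup.
        if not identifier.lower().startswith("urn:"):
            return [identifier]
        return self._directory.lookup(identifier)


if __name__ == "__main__":
    directory = UrnDirectory()
    # All names below are made up for illustration.
    directory.register("urn:example:report-42", "ftp://archive-a.example.org/pub/report42.txt")
    directory.register("urn:example:report-42", "gopher://archive-b.example.org/0/report42")
    rls = Resolver(directory)
    print(rls.resolve("urn:example:report-42"))

    # The object moves; the URN held by the application stays the same.
    directory.move(
        "urn:example:report-42",
        "ftp://archive-a.example.org/pub/report42.txt",
        "ftp://archive-c.example.org/docs/report-42.txt",
    )
    print(rls.resolve("urn:example:report-42"))
```

The point of the design is visible in the last two calls: the application keeps asking for the same URN, and only the directory's mappings change when the object moves.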


WAIS  IN  THE  FUTURE


Brewster  Kahle,  President 
WAIS,  Inc. 


It's  a  real  thrill  to  be  at  the  Library  of  Congress,  doing  something  that  most  of  the
people  in  the  Internet  think  of  as  sort  of  underground  and  mischievous.  The  Internet  is
sort  of  --  they're  so  proud  of  themselves  and  ourselves,  of  being  sort  of  the  underground,
and  the  Library  of  Congress  is  NOT  part  of  the  underground,  alright?  The  Library  of 
Congress  for  most  people  connotes  a  sense  of  quality,  a  sense  of  endurance,  permanence. 
If  you  actually  get  your  book  accepted  by  that  random  gatekeeper  at  the  Library  of 
Congress,  and  put  into  the  collection,  it's  going  to  last  a  really  long  time,  and  historians 
are  going  to  be  able  to  know  about  it. 

So  I  think  it's  a  great  combination.  What  I  thought  I'd  do  is  say  a  little  bit  about 
WAIS  into  the  future.  Sorry  about  inventing  an  acronym... I'll  try  to  pun  on  it.  But  what 
I  find  interesting  about  this  area,  is  that  very  few  generations  get  to  see  a  technology 
change  in  the  way  people  communicate  with  each  other.  There's  the  book,  there's  the 
telephone,  and  now  we're  seeing  the  electronic  dissemination  of  information:  a 
technology  change  for  how  people  communicate  with  each  other.  All  sorts  of  really  wild 
and  wonderful  things  happen,  when  you  have  one  of  these  things.  Corporations  come 
and  go,  city  structures  change,  basically  what  it  means  to  be  a  human  being,  changes, 
when  we  have  a  new  mechanism  for  connecting  people.  And  what's  interesting  about 
this  revolution,  is  that  it's  an  inclusive  one.  A  lot  of  us,  in  this  room,  are  taking  part  in 
forming  it,  and  trying  to  figure  out  how  it  makes  sense,  what  things  are  right,  what  things 
are  wrong,  and  we're  going  through  the  prototyping  of  what  this  new  generation  will  look 
like.

Well,  really  what  we're  talking  about  mostly  in  all  this  stuff  is  plumbing.  And 
plumbing  --  you  know,  how  does  it  all  connect  up,  does  it  really  go  through,  gopher, 
Z39.50  version  1,  version  2,  it's  all  plumbing.  And  usually  the  only  time  you  care  about 
plumbing,  is  when  it's  backing  up.  What  the  users  really  want  to  know  is,  "what  can  I  get 
to?"  "What  can  I  do  now,  that  impinges  on  my  job?"  And  that's  what  I  hope  to  talk 
about,  not  so  much  what's  out  there  now  -  but  a  little  bit  about  where  we  are  now,  why 
we're  here,  what's  going  right,  why  there  are  this  many  people  at  least  interested  in  the 
area,  and  where  are  we  going. 

I  thought  I'd  try  to  say  a  little  bit  about  where  are  we.  WAIS  on  the  Internet. 
WAIS  on  the  Internet  is  built  on  Z39.50,  version  1,  everybody's  going  to  version  2+,  but 
it's  basically  a  standards-oriented  system,  that  has  gotten  very  widespread  use.  There's 
this  feeling  of  pent-up  demand.  Everybody's  got  something  to  say.  And  lots  of  people 
want  to  be  able  to  find  lots  of  things  that  are  out  there.  There  are,  on  WAIS  itself,  as  of 
six  months  ago,  about  30,000  users,  on  the  Internet,  using  these  information  resources.
The  users  were  in  about  33  countries,  pretty  much  North  America,  Western  Europe,  and
the  Pacific  Rim;  Africa  is  a  basket  case,  we've  not  been  able  to  get  network  connections,
so  not  a  lot  of  users  are  coming  from  Africa  yet.  And  we're  moving  down  into  South
America. 

The  sorts  of  things  that  people  are  able  to  get  at:  One  of  the  handouts  is  an  ancient 
printed  copy  of  what  you  can  get  at  using  WAIS.  Astronomy  images,  not  just  text,  but 
whole  images  of  what  happens  if  you  look  up;  periodical  references,  biology  journal 
contents  -  there's  a  database  in  Finland,  that's  basically  the  table  of  contents  from  lots 
of  different  journals.  There  are  35  databases  from  the  biology  domain  alone,  that  are 
being  served  on  WAIS.  The  Communications  of  the  ACM,  which  is  a  group  that's  put  a
pilot  server  up,  of  all  of  the  full  text  of  the  Communications  of  the  ACM,  it's  now  out  of
date  but  it's  showing  publisher  interest,  even  though  it's  still  based  on  no  money
changing  hands  yet;  speeches  of  Bill  Clinton,  and  press  releases,  are  served;  the
Simpsons,  capsules  of  episodes  from  the  Simpsons.  Basically  we're  finding  that  it's  not
just  the  big  boys  that  can  go  and  control  the  printing  presses  in  this  world.  Anybody  with 
something  to  say  can  start  putting  it  out.  And  be  amazed  at  how  many  people  go  out 
and  use  those  resources.  Columbia  Law  Library  catalogs,  the  U.S.  Supreme  Court 
decisions  in  full  text,  some  documents  from  the  Office  of  Technology  Assessment.  So  it 
just  goes  on  and  on.  There  are  450  databases  now,  it's  doubling  about  every  six  months. 


What  we're  looking  for  now  is  to  increase  the  quality.  We've  got  basically  past  the 
technology  showcase  period,  but  now  we're  trying  to  figure  out,  how  do  we  make  this 
stuff  real:  how  do  we  get  commercial  publishers  involved,  and  how  do  we  get  quality
collections,  like  the  Library  of  Congress,  up  and  easily  accessible  by  lots  of  people. 

Why  are  we  here?  What  are  the  issues  that  have  made  this  sort  of  thing  go  forward? 
A  couple  of  aspects.  Publishing  is  exploding.  Just  the  amount  of  information  people 
want  to  disseminate  is  just  going  through  the  roof.  You  know  these  exponential  graphs: 
the  number  of  journals,  the  number  of  conferences,  and  basically  that  is  creating  a 
pent-up  demand  for  getting  information  out.  The  other  is  on  the  consuming  side. 
People  are  expected  to  know  more  and  more  of  what's  out  there.  I  don't  know  how  many 
meetings  I  go  into,  nowadays,  it  used  to  be  that  you  just  read  the  "Wall  Street  Journal,"
the  "New  York  Times,"  you  could  walk  into  a  meeting,  and  you  got  your  [self]  covered.
Nobody's  going  to  come  up  with  a  random  event  from  the  day  that  you  don't  know  at
least  something  about.  But  now  you  get  these,  "I  saw  this  on  the  Net,"  or  "Did  you  see
this  thing  in  this  obscure  journal,"  and  you're  sort  of  expected  to  know  more.  Which 
now  means  we  basically  need  all  the  mechanisms  for  finding  the  right  information  for  us, 
without  making  us  have  that  anxiety  attack  and  shaking  when  we  go  home.

Another  aspect  is  that  global  communication  is  becoming  the  mechanism  for  dealing
with  all  information.  And  the  characteristics  of  paper  distribution  just  aren't  making  it. 
We  use  electronic  to  download  it,  and  then  paper  to  view  it,  until  we  actually  have 
screens  that  are  worth  looking  at,  but  the  mechanism  for  distributing  the  information  is 
going  electronic. 

There  are  other  aspects  about  why  we're  here.  Middle  layer  managers,  who  often 
are  the  mechanisms  for  information  dispersal  within  large  organizations,  are  being  fired 
in  droves.  So  we're  getting  a  lot  more  peer  to  peer  communication,  where  engineers  and 
scientists  are  trying  to  find  the  right  stuff,  without  all  the  infrastructure  of  people  to 
move  information  around.  There  are  cost  pressures.  Government  agencies  are  being
mandated  to  get  their  information  out.  So  if  they  go  out  and  answer  every  phone  call, 
and  have  to  go  and  send  out  some  sheets  of  paper,  they'll  just  go  broke.  So  these  are 
cheaper  ways  of  distributing  information. 

And  another  aspect  of  why  this  is  going  so  well  in  Washington,  is  that  we've  got  a 
new  President,  that  really  loves  this  stuff.  And  boy,  that  just  changes  things  around.  I 
come  from  the  .com  world,  the  commercial  world,  and  I  don't  actually  see  people  paying 
as  much  attention  to  the  president  of  Apple,  within  Apple,  as  I  see  people  paying 
attention  to  the  President  of  the  United  States.  So  when  we  have  a  President  of  the 
United  States  saying,  "let's  go  more  electronic,  and  get  more  of  our  information  out,"  it 
actually  does  make  quite  a  bit  of  a  change  all  the  way  down  through  one  of  these 
organizations. 

OK.  Where  are  we  going.  What  we'll  see  today,  and  I  suggest,  one  of  the  most 
important  things  about  today's  meeting,  is  not  so  much  my  standing  up  here  and 
blathering  along.  It's  the  demonstrations  downstairs  in  the  Atrium.  Get  an  idea  of  what 
this  stuff  really  looks  like,  what  it  really  feels  like,  why  people  actually  can  go  and  use 
this  stuff.  This  is  the  first  time  that  I've  done  anything  as  an  engineer,  that  my  mom 
could  use.  Usually  I  get  that  kind  of  "proud  of  you,  son,  I  don't  know  what  you're  doing, 
but  I'm  sure  it's  important."  But  she  was  actually  able  to  go  in  and  use  recipes,  lyrics, 
poems,  all  kinds  of  things  that  she  was  actually  interested  in.  So  this  isn't  the  sort  of 
frightening,  MIS  department,  "gee  it's  going  to  be  good  for  you,  we  promise,"  kind  of 
thing.  It's  actually  usable,  it's  often  even  fun.  So  please  do  it. 

Where  are  we  going?  The  important  part  is  to  increase  the  quality  of  the 
information  that's  available  on  the  Net.  There  are  two  major  areas  that  we're  pushing 
on  within  WAIS  Inc.  Federal  information  sharing,  and  the  commercial  publishers.  The 
federal  sector  is  a  great  area,  because  they  in  general  don't  charge  for  access  for  things. 
So  there's  a  mechanism,  and  there's  a  community,  that's  trying  to  get  information  out. 
So  that  can  be  the  early  adopters  in  putting  like  the  Library  of  Congress  card  catalog 
out,  or  patents.  To  give  you  sort  of  an  idea,  here's  a  message  that  I  got  on  email
Friday,  from  somebody  at  ,  that  I've  never  met,  often  you  get  these  sorts  of  random
messages.  "Brewster:  Of  all  the  WAIS  sources  that  I  use,  the  one  that  I've  found  the
most  valuable,  without  question,  is  the  patent  source  that  is  at  Thinking  Machines.  Yes,
it's  only  two  months  worth,  but  I've  learned  an  amazing  amount,  from  just  those  two 
months.  If  anybody  is  thinking  of  building  and  maintaining  a  real  patent  WAIS  source,  it 
seems  that  this  would  be  relatively  easy,  and  highly  profitable,  even  if  they  charged  one 
tenth  of  what  DIALOG  charges."  OK,  so  this  is  just  two  months,  that  we  got  from  the
Patent  Office,  and  put  up  on  an  example  WAIS  server  a  couple  of  years  ago,  and  people 
use  this  stuff  all  the  time. 

I  had  a  good,  day-long  meeting  with  the  Patent  Office  yesterday,  with  a  whole  bunch 
of  different  WAIS  people.  And  they're  inclined  to  try  to  move  forward  with  this.  It 
turns  out  that  the  cost  of  going  and  putting  all  of  the  patents  up,  and  having  thousands  of 
users  access  it,  equipment,  software,  installation,  is  about  $150,000.  Holy  crow! 
$150,000.  One  man-year.  And  you  could  basically  go  and  set  this  stuff  up,  I  know,
because  I  sell  them,  and  if  you  want  to  go  and  buy  one,  I'll  sell  you  one.  But  there  are
other  places  you  can  go  to  as  well.  The  prices  have  been  dropping,  phenomenally,  so  it's 
not  just  a  mainframe  issue,  with  millions  of  dollars  going  out. 

Another  area  that  we  find  very  interesting  and  useful  is  the  commercial  publisher. 
During  1993,  at  the  end  of  this  year,  we're  going  to  start  bringing  up  the  first  for-pay 
information  sources  on  the  Internet,  using  the  WAIS  technology.  So  that  people  can 
start  using  Z39.50  to  find  what  they  want,  out  of  newspapers,  magazines,  journals,  and 
start  to  pay  for  it.  The  Internet  right  now  is  pretty  much  a  freebie  world.  And  people 
are  kind  of  used  to  that.  But  they're  also  really  clamoring  for  higher  quality.  So  there 
are  those  that  want  the  higher  quality.  And  the  publishers  are  saying,  hey,  there  are  5 
million  people  out  there,  I  can  sell  to,  and  why  don't  I  try  and  do  that.  Now  that's  not  to 
say  that  there  aren't  for-pay  information  services  now,  already,  for  instance  DIALOG
and  Dow  Jones  make  their  information  available.  But  basically  the  same  old  dumb 
terminal  dial-up,  feels  like  a  horrible  mainframe-at-a-distance  kind  of  structure.  But 
DIALOG  has  said  that  when  they  put  this  stuff  up,  it  was  about  six  months  ago,  about 
3%  of  their  users  are  now  coming  over  the  Internet,  instead  of  going  through  the 
commercial  data  providers.  What  WAIS  is,  is  a  more  sophisticated,  easier,  nicer-to-use 
thing. 

The  other  thing  that's  happening  now,  is  we're  getting  multiple  vendors,  coming  in 
and  working  on  the  same  protocol.  We  all  have  this  mantra,  "use  the  protocol,  use  the 
protocol."  It's  the  thing  that  binds  us  together,  even  though  we're  going  to  be  competing 
like  nuts,  to  go  and  make  the  best  servers,  the  best  clients,  the  best  information
resources.  So  right  now  commercial  vendors  are  really  coming  into  the  play,  and  we're
starting  to  see  products  come  out  of  a  lot  of  different  library  vendors,  and  other  search
vendors. 

The  fun  stuff.  What  are  the  fun  things  that  are  going  on?  The  things  that  I  find 
most  interesting  about  the  last  six  months,  till  the  next  year  or  two  of  developments  that 
are  going  on,  is  that  we're  getting  multiple  languages.  This  used  to  be  pretty  U.S.-only
kind  of  stuff.  It's  gotta  be  written  in  English,  it's  gotta  be  written  in  ASCII.  We're
starting  to  get  formatted  documents,  and  searching  in  other  languages.  We're  working
with  a  Japanese  partner,  so  that  Japanese  searching  and  retrieval  can  be  done  innately.
German,  Spanish,  Italian,  it's  all  happening,  it's  all  based  around  standards  like
UNICODE  or  Microsoft  Word,  for  bringing  back  a  document  in  Russian,  that  you  can 
go  off  and  read. 

Multimedia,  we  have  images  of  pages  already;  we've  got  astronomy  images;  and 
starts  of  little  dinky  videos  and  audio  feeds,  that  are  starting  to  be  moving  around,  on  the
Internet,  through  WAIS.

We're  finding  that  developing  countries  are  starting  to  participate,  and  the 
developing  countries,  isolated  countries,  the  countries  that  are  really  using  this 
technology  the  most,  in  my  experience,  are  the  ones  that  feel  like  they're  left  out  in 
outfield  somewhere:  Australia,  Singapore.  They  feel  like  they've  been  kind  of  left  out,
because  it's  too  far  away.  And  they  say,  "Ah,  this  is  our  mechanism  of  participation."  So 
Australians  are  using  WAIS  harder  than  any  other  country  other  than  the  United  States. 
Singapore  is  setting  itself  up  to  be  a  very  major  information  country  in  the  biology 
domain.  So  that's  kind  of  neat. 

And  we'd  like  to  get  it  to  more  and  more  of  the  third  world  countries,  through 
network  connections  often  going  through  satellites,  through  things  like  diplomatic
channels,  and  then  using  that  as  mechanisms  for  setting  up  Internet  resources. 

The  copyright  issues  are  being  solved.  The  best  way  to  solve  the  copyright  issue  is 
for  people  to  start  making  money,  and  not  losing  their  shirt.  Publishers  are  fairly  easy  to
predict,  like  capitalists  in  general,  right.  They  want  to  make  sure  they  have  a  job  next 
year,  and  hopefully  they'll  be  paid  a  little  more  next  year  than  they  were  last  year.  So 
making  a  mechanism  for  them  to  get  paid,  and  mechanisms  for  protecting  their 
intellectual  property,  are  being  solved.  It's  not  going  to  be  the  end-all,  but  it's  going  to 
be  good  enough,  to  make  it  so  that  the  Internet  is  not  a  frightening  place. 

Enhanced  security  systems  are  all  part  of  this.  Because  it's  not  only  for  billing,  but 
you  also  want  to  make  some  databases  only  available  for  certain  people.  We  built  an 
information  system  for  the  Perot  campaign,  which  was  great  fun,  and  there's  all  sorts  of 
database  restrictions  on  access.  You  wanted  some  information  to  go  out  to  all  of  the 
satellite  offices,  but  there  were  some  things  that  you  wanted  to  just  have  particular 
people  to  be  able  to  get  to.  So  those  are  now  being  built  into  the  systems  in  a  variety  of 
different  ways. 

So  yes,  this  is  a  technology  that  is  real,  it's  being  used  by  lots  of  people.  I 
recommend  you  get  a  real  feel  for  it. 

So  what?  My  last  slide.  What's  going  on  is  this  information  dissemination  approach
is  cheap,  fast,  global,  and  personal.  This  technology  has  dropped  in  cost  phenomenally. 
The  cost  of  going  and  serving  a  set  of  books,  is  on  the  order  of  $5000  for  the  hardware 
and  about  $5000  for  the  software,  and  you  can  serve  gigabytes,  a  thousand  books,  to  a 
worldwide  population  of  thousands  of  users  coming  at  you,  for  $10,000.  These  sorts  of 
things  just  weren't  possible  when  we  printed  things  on  slivers  of  dead  trees.  Electrons 
are  cheap,  and  the  Internet  is  an  infrastructure  that  we  can  leverage.  When  people  say 
well,  "we're  required  to  do  cost  recovery,  for  the  incremental  cost  for  distributing  our 
information."  I  mean,  well  gosh,  what's  the  incremental  cost  of  most  of  these  WAIS 
servers?  The  closest  I  can  come  to  is  a  penny  per  copy,  is  the  incremental  cost  for
spinning  your  disk,  or  going  and  running  your  computer.  Dirt  cheap. 
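
The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the $5000 hardware, $5000 software, and penny-per-copy numbers come from the talk, while the function name and the copy counts are assumptions chosen for the example.

```python
# Figures quoted in the talk, expressed in cents to keep the arithmetic exact.
FIXED_CENTS = (5000 + 5000) * 100   # hardware plus software
INCREMENTAL_CENTS = 1               # roughly a penny per copy served


def average_cost_cents(copies):
    """Average cost per copy once the fixed cost is spread over all copies."""
    return (FIXED_CENTS + INCREMENTAL_CENTS * copies) / copies


# Hypothetical volumes: the more copies served, the closer the average
# gets to the incremental cost alone.
for copies in (1_000, 100_000, 1_000_000):
    print(copies, "copies:", average_cost_cents(copies), "cents each")
```

At a million copies the fixed cost adds only about one more cent per copy, which is the sense in which the incremental cost dominates any cost-recovery calculation.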

Basically,  this  is  a  mechanism  to  do  everything  --  the  information  distribution  of  our 
dreams  and  information  gluttons'  mechanism  for  getting  at  this  stuff.  It's  fast.  The 
Internet  stuff,  you  can  download  a  book  in  about  10  seconds.  Pretty  good!  The  sorts  of 
things  I  was  seeing,  demonstrations  at  the  Patent  Office,  you  were  flashing  up  pages  of 
scanned  images  of  pages  'cause  we're  starting  to  avoid  ASCII  these  days;  you  scan  the 
images  of  the  pages,  run  it  through  optical  character  recognition,  search  based  on  the 
ASCII  and  pull  back  pictures  of  the  pages.  Basically,  a  mechanism  for  retrospective 
conversion  of  paper.  It  drops  the  cost  of  putting  paper  into  a  computer  to  close  to  xerox 
costs.  So  this  is  a  mechanism  for  moving  forward  with  preserving  pictures  and  all  that 
kind  of  thing.  And  the  speed  we  were  seeing  was  sort  of:  Flash,  Flash,  Flash,  Flash,  Flash 
for  pages.  You  could  flip  through  pages  that  fast  with  workstations.  Of  course,  for  those  of
us  who  are  on  poor  Macintoshes  and  things,  it's  F_L_A_S_H  F_L_A_S_H.  But  it's
getting  there,  OK.  So,  right  now  those  of  us  on  poor  Macintoshes  and  PCs  stay  with
ASCII  and  Microsoft  Word  but  those  on  workstations,  you  know,  get  this  sort  of  speed 
and  it  will  be  a  year  or  two  before  scanned  images  of  pages  on  any  old  workstations  will 
do  fine. 
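
The scan, recognize, search, and return-the-picture approach described above can be sketched roughly as follows. This is a minimal illustration, not the Patent Office system or the WAIS indexer: the OCR step is assumed to have already produced text for each page, and the file names and page contents are hypothetical.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?\"'()"


def build_index(ocr_pages):
    """Map each lowercased word to the set of page-image names containing it.

    ocr_pages: dict of page-image file name -> OCR text for that page.
    """
    index = defaultdict(set)
    for image_name, text in ocr_pages.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(image_name)
    return index


def search(index, query):
    """Return the sorted image names whose OCR text contains every query word."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical OCR output for three scanned pages.
pages = {
    "patent_0001.tif": "A method for grinding coffee beans.",
    "patent_0002.tif": "Apparatus for roasting coffee.",
    "patent_0003.tif": "A method for sharpening pencils.",
}
index = build_index(pages)
print(search(index, "coffee method"))
```

The search runs on the recognized text, but what comes back is the name of the page image, so the reader sees the original page even where the OCR text is imperfect.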

Another  fact:  it's  global.  We're  already  seeing  lots  of  participation,  from  lots  of 
different  countries.  And  a  key  one,  is  it's  personal.  This  stuff  is  driven  by  the  reader. 
Instead  of  the  publisher  going  and  saying,  "you've  got  to  see  this,"  this  is  oriented  for  the 
reader.  It's  what  I  want  to  see.  We're  setting  up  the  better  mechanisms  for  filtering 
things.  How  does  WAIS  fit  into  all  of  these  systems?  We  get  a  lot  of  debates  on
Gopher  vs.  WAIS,  and  Veronica  vs.  something  else,  and  all  these  things.  What  WAIS  is,
is  a  mechanism  coming  from  the  publisher,  towards  the  user.  The  publisher  or  the 
information  resource.  It's  a  big  search  and  retrieval  approach. 

The  idea  is  to  have  publishers  put  their  information  up  once,  with  one  protocol  that's 
an  international  standard,  and  have  all  those  people  out  there  go  out  and  make  better 
user  interfaces  for  getting  at  it.  So,  you'll  get  the  gophers,  the  World  Wide  Webs,  you'll 
get  gateways  into  Compuserve,  America  Online,  you'll  get  to  lots  of  different  user 
communities,  by  going  and  publishing  your  information  once,  on  the  Internet,  using  a
standard  protocol. 

That's  what  WAIS  is  trying  to  do.  It's  moving  from  the  publisher  or  the  information
provider  out,  and  there  are  lots  of  people  going  the  other  way.  And  frankly,  the 
interfaces  that  we've  got  now  all  stink,  and  the  better  ones  are  really  coming  out  of 
places  like  XEROX  PARC,  and  Apple  Computer,  NEC  Computer,  Microsoft  --  that 
really  understand  how  to  build  user  interfaces.  Are  any  of  those  available?  No,  not  yet. 
Some  of  them  are  in  development,  some  are  being  now,  some  of  them  don't  even  exist. 
But  those  are  the  environments,  that  the  best  user  interfaces,  I  suggest,  will  come  from. 

What  we  as  the  federal  government,  commercial  publishers,  and  libraries,  should  just 
do  is  make  it  possible  for  people  to  access  our  holdings.  And  luckily,  the  stuff  is  cheap. 
So  why  is  it  important?  We  started  early.  WAIS  jumped  in  this  area  before  there  were 
any  standards,  really,  for  widespread  access.  We  picked  Z39.50  because  of  its 
committee,  rather  than  its  standard.  The  standard  when  we  started  on  it  was  just 
basically  for  librarians,  for  finding  MARC  records,  card  catalog  records,  and  frankly,  our 
users  couldn't  care  less  --  they  wanted  images,  they  want  full  text  documents,  they  want 
browsing,  they  want  to  search  lots  of  different  servers  at  once,  and  that  really  wasn't 
what  Version  1  was  about,  but  they  said,  "we'll  swing  with  you,  we'll  go  and  figure  out 
what  it  is  you  need,  and  help  put  it  in  the  process." 

Version  2  is  basically  making  Z39.50  compliant  with  the  Europeans,  the  ISO 
standard,  and  putting  in  some  of  the  features  that  we  needed  for  full  text  type  things,  and 
Version  3  is  hopefully  going  to  be  what,  if  you  pick  up  the  standard  and  implement  it, 
everything  will  interoperate. 

So  we're  all  completely  concerned  with  interoperability,  we're  all  implementing 
Version  2,  and  we  keep  that  mantra,  "use  the  protocol."  And  there  are  other  pieces  of 
the  protocol  that  are  coming  along  -  the  URLs,  URNs,  security  systems,  billing 
standards,  data  exchange  formats  like  SGML,  are  all  being  basically  wrapped  together 
into  a  working  whole.  One  of  the  words  that  is  commonly  used  for  this  working  whole,  is 
WAIS.  But  it's  completely  dependent  on  all  of  the  people,  a  lot  of  the  people  in  this 
room  have  spent  years  building  clients,  distributing  them  often  for  free,  some  of  them 
are  starting  to  be  employed  in  this  general  area.   We're  basically  starting  an  industry. 

What  I  hope  you  get  out  of  today  is  an  idea  of  what  WAIS  is,  what  it  isn't,  get  an 
idea  of  getting  some  hands  on  stuff,  downstairs,  of  how  hard  is  it  to  go  and  build 
information  sources,  and  get  them  out,  how  hard  is  it  to  use  the  stuff  that's  out  there, 
and  I  hope  you  go  off  and  see  that  it's  not  just  a  technology  that's  stopping  now,  but  that 
it's  growing.  Thank  you. 

FEDERAL  INFORMATION  LOCATOR

Eliot  Christian 
USGS 


What  is  a  locator?  First,  it's  electronic.  So  for  those  in  our  society  who  do  not 
have  access  to  electronic  information,  they  must  be  served  on  a  walk-in  basis  or  via
surface  mail  by  information  specialists,  such  as  librarians,  who  do  have  access  to  it.  The
locator  we  are  talking  about  then  is  an  electronic  mechanism.  It  locates  information,  not
in  the  sense  that  librarians  think  of  a  locator  which  tells  you  where  something  is 
physically  so  you  can  go  fetch  it,  but  in  the  sense  of  an  entry  that  says  that  the  USGS  has 
an  archive  of  all  of  the  Landsat  photographs  and  they  are  resident  at  the  EROS
Data  Center  and  here  is  how  you  can  get  hold  of  them.  That  is  what  we  are  calling  a 
locator.  You  might  also  hear  the  term  metadata  when  the  holdings  are  data.  You  might 
also  hear  the  term  meta-information. 

The  locator  has  very  high  level  descriptive  information.  You  might  think  in  terms 
of  a  couple  of  hundred  entries  per  agency;  that  is  about  how  many  we  have  to  describe
the  holdings  of  the  USGS,  for  example.  Government-wide  you  are  talking  about  10,000 
entries.  It  is  high  level  information  and  it  is  not  the  information  itself  except  in  rare 
limiting  cases  such  as  the  Consumer  Price  Index,  which  by  the  time  you  have  information 
about  it,  you  in  fact  have  the  information  itself.  That  is  not,  of  course,  the  case  with 
something  like  the  Landsat  archives.  So  it  is  pointers  to  the  information  and  not  the 
information  itself.  It  is  all  electronic.  It  is  high-level  information. 

Once  again  in  the  federal  government  we  are  trying  to  make  government 
information  accessible  to  the  public.  There  is  something  called  the  Federal  Information 
Inventory  Locator  System  or  FIILS  which  was  mandated  by  law  several  years  ago.  It  was 
what  was  classical  several  years  ago:  it  was  a  big  centralized  system.  All  the  agencies 
were  required  to  feed  into  it.  It  did  not  succeed.  The  reasons  why  it  did  not  succeed 
are  very  well  documented  in  a  study  that  the  Office  of  Management  and  Budget, 
National  Archives,  and  the  General  Services  Administration  commissioned.  The  study 
was  done  primarily  by  Dr.  Charles  McClure  of  Syracuse.  It  basically  said  that  the 
problem  is  that  you've  cut  away  the  feedback  loop:  people  are  expected  to  feed  into  it, 
but  they  get  no  immediate  value.  The  one  centralized  system  is  not  what  agencies  use  to 
manage  their  own  business,  so  it  is  not  well  supported. 

The  Office  of  Management  and  Budget  issued  a  press  release  with  Circular  A-130
which  Peter  Weiss  talked  about  earlier.  The  press  release  stated  that  OMB  is  committed
to  promoting  the  establishment  of  an  agency-based  government  information  inventory 
locator  system.  The  change  from  federal  to  government  is  not  significant,  but  the  change 
to  agency-based  is.  What  is  now  being  said  is  that  we  realize  that  a  centralized  system 
does  not  work;  if  you  can  instead  leverage  off  of  the  existing  stuff  that  agencies  use 
anyway  you  can  do  a  lot  more  and  you  can  hope  that  it  will  be  maintained  because  you
are  leaving  the  maintenance  of  the  information  with  those  people  who  care  most  about 
it,  a  basic  principle  of  good  management. 

Let  me  jump  into  a  Gopher-view  of  fed-space--an  organization  chart.  There  is  a 
thing  you've  heard  a  little  about  already  called  the  National  Information  Infrastructure. 
That  term  is  awfully  broad.  There  is  an  article  this  month  in  "Scientific  American"  called 
"Domesticating  Cyberspace"  which  gives  you  a  sense  of  it.  The  article  talks  about  high 
definition  television,  movies  on  demand,  optical  fiber  to  the  home,  rolling  back 
divestiture.  That  is  not  what  we're  talking  about;  that's  the  parent.  Out  there 
somewhere  is  the  National  Information  Infrastructure.  There  will  be  legislation.  Within 
the  whole  infrastructure  there  is  a  piece  that  is  the  responsibility  of  the  federal 
government.  There  is  a  piece  of  that  that's  within  the  Executive  Branch  which  is  where 
the  USGS  as  an  agency  resides.  The  Executive  Branch  has  structured  a  group  called  the 
Information  Infrastructure  Task  Force  referred  to  briefly  earlier.  It  is  chaired  by  the 
National  Economic  Council.  That  should  tell  you  something  right  there;  this  is  a  macro- 
economic  view  of  information  infrastructure,  a  little  different  than  straight  information 
dissemination  or  how  information  can  enhance  democracy,  but  valuable  in  its  own  right. 

Within  the  Information  Infrastructure  Task  Force  three  committees  have  been  set 
up.  The  Department  of  Commerce  heads  the  Telecommunications  Policy  Committee 
which  will  focus  on  issues  such  as  the  allocation  of  bandwidth,  divestiture,  the  telco 
issues.  There  is  another  group  called  the  Committee  on  Applications  which  is  concerned 
with  the  High  Performance  Computing  and  Communications  initiative,  funding  specific
research  activities  which  would  then  have  fall-out  and  be  part  of  the  information
infrastructure  in  terms  of  doing  the  basic  research  so  we  understand  how  to  build
these  things. 

The  committee  I  want  to  talk  about  further  is  the  Information  Policy  Committee. 
It  is  headed  by  Sally  Katzen,  who  is  the  new  Office  of  Management  and  Budget
Administrator  for  the  Office  of  Information  and  Regulatory  Affairs.  The  Information
Policy  Committee  splits  out  into  three  groups:  one  is  concerned  with  security  and  privacy,
securing  the  rights  of  citizens  to  not  have  their  privacy  violated  by  the  federal 
government;  a  second  is  concerned  with  intellectual  property,  the  copyright  issues;  the 
third  is  concerned  with  federal  information  dissemination.  I  think  it  is  important  that  of 
all  of  the  issues  which  could  be  dealt  with  under  information  policy,  this  administration 
sees  getting  the  government's  own  information  out  to  its  people  as  one  of  the  highest 
level  breakouts.  They  have  said,  and  you  will  see  in  Sally  Katzen's  press  release,  that 
there  are  three  things  that  they  are  going  to  try  to  do,  and  I  will  mention  a  fourth  one. 
The  three  things  that  they  are  going  to  try  to  do  are:  improve  email  among  the 
agencies  --  many  of  you  who  are  Internet  surfers  will  be  surprised  to  find  that  the  vast
majority  of  the  federal  government  does  not  do  email  and  even  when  we  do  we  often  use 
something  called  X.400  which  is  archaic;  there  is  another  thing  which  we  are  trying  to 
accomplish  within  the  federal  government  that's  been  in  law  for  a  long  time,  to  get  more
and  more  of  the  paper  converted  over  to  electronic  form,  things  like  your  IRS  forms  or 
especially  the  booklets  which  tell  you  how  to  fill  out  the  IRS  forms. 

The  other  one  is  to  promote  the  establishment  of  the  government  locator  system.
The  tasks  will  in  all  likelihood  have  guidance  from  OMB  to  the  agencies,  typically  in  the 
form  of  an  OMB  bulletin.  Although  we  may  also  get  higher  level  guidance  because 
there  is  legislation  now,  the  Paperwork  Reduction  Act  for  example  has  very  specific 
things  to  say  about  how  we  do  a  government  locator.  We  will  build  and  operate  a 
prototype  locator.  I  think  we  already  see  the  beginnings  of  that  with  a  lot  of  the 
WAIS-based  government  information  that's  already  out  there,  plus  things  like  FedWorld 
which  is  a  bulletin  board  approach,  and  some  CD-ROM  locators.  We'll  have  to  come  to 
some  agreement  about  metadata  standards.  All  that  really  means  is  that  if  you  are  going 
to  describe  holdings,  there  must  be  some  common  handles.  For  example,  maybe  title, 
abstract,  and  cost  would  be  the  common  descriptors.  In  fact,  in  one  of  the  bills  before 
Congress,  these  are  the  elements  mentioned  as  required  in  the  locator.  There  may  be 
some  additional  items  needed,  but  what  is  important  is  to  come  to  some  agreement 
about  a  common  set  of  elements.  That  is  what  will  be  put  out  so  that  all  agencies  will 
publish  their  information  that  way,  in  addition  to  other  ways  that  they  publish  for  use  in
their  own  communities. 
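
As an illustration of what "common handles" might look like in practice, here is a minimal sketch in Python. It assumes only the three elements named above (title, abstract, cost); the class name, the function names, the `extras` field, and the sample record are hypothetical and do not reflect any agreed locator format.

```python
from dataclasses import dataclass, field

# The common descriptors mentioned in the talk.
REQUIRED = ("title", "abstract", "cost")


@dataclass
class LocatorEntry:
    title: str
    abstract: str
    cost: str
    # Anything an agency publishes beyond the common set is kept, not discarded.
    extras: dict = field(default_factory=dict)


def missing_elements(record):
    """Return the common elements that are absent or blank in a raw record."""
    missing = []
    for name in REQUIRED:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def to_entry(record):
    """Build a LocatorEntry, refusing records that lack a common element."""
    missing = missing_elements(record)
    if missing:
        raise ValueError("missing common elements: " + ", ".join(missing))
    extras = {k: v for k, v in record.items() if k not in REQUIRED}
    return LocatorEntry(record["title"], record["abstract"], record["cost"], extras)


# Hypothetical entry; the contact and cost values are placeholders.
entry = to_entry({
    "title": "Landsat archive",
    "abstract": "Pointer to the archive of Landsat photographs held by the USGS.",
    "cost": "varies by product",
    "contact": "to be supplied by the agency",
})
print(entry.title, "|", sorted(entry.extras))
```

Keeping agency-specific fields alongside the common set reflects the point that agencies would publish the agreed elements in addition to whatever their own communities need.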

As  I  said,  we  are  going  to  build  on  work  already  underway,  but  it  is  very 
important  that  we  get  lots  more  involvement  of  other  inter-agency  committees  that  are 
already  in  operation  such  as  the  Z39.50  implementors'  group  which  will  be  talking  to  you 
next.  In  the  area  of  the  metadata  agreement,  we  need  to  decide  on  our  goals  and 
approach,  our  primary  and  secondary  communities.  It  turns  out  that  if  you  view 
metadata  from  an  archivist's  viewpoint  things  like  the  technical  contact  for  how  to  get 
information  aren't  really  relevant  because  if  you  are  thinking  in  terms  of  one  hundred 
years  later,  why  would  anyone  need  the  name  of  the  person  responsible  for  it?  There  are
different  perspectives  depending  on  what  you  are  using  the  data  for.  We  asked  who  is
our  primary  community.  Someone  said  it  is  everyone.  Just  literate  people?  Just  English
speakers?  Somebody  else  said  K-12  too.  So  we  are  going  to  try  to  accommodate,
understand,  and  be  explicit  about  our  different  user  communities,  interests,  and
capabilities.  Again,  it  is  important  that  we  pay  a  lot  of  attention  to  those  who  aren't
already  part  of  the  electronic  age.  We  are  going  to  get  a  document  out  for  comments 
from  as  many  people  as  we  can  get,  talking  about  how  we  are  going  to  do  the  metadata. 
We  want  a  lot  of  public  involvement  in  that.  Again,  re-emphasizing  that  input  from  the
affected  communities  is  going  to  be  the  key  to  having  a  locator  that  gets  well  accepted. 

As  a  sidelight,  there  are  a  lot  of  folks  who  make  money  by  fixing  the  fact  that  the
government  does  not  do  a  good  job  of  providing  access  to  its  information.  In  some 
sense  we  are  competing  with  them  so  we  have  to  make  sure  that  we  understand  what 
people's  agendas  are  and  where  they  are  coming  from  as  we  move  into  this.  It  is  an 
exciting  thing  to  do,  a  bit  of  a  minefield,  but  we  are  going  to  be  moving  out  on  it  sharply. 
I  invite  you  all  to  get  involved  to  whatever  extent  you  can. 

Z39.50:  FREQUENTLY  ASKED  QUESTIONS  SESSION
PANEL  MEMBERS: 

Ray  Denenberg  (Library  of  Congress/NetDev) 

Jim  Fullton  (CNIDR) 

Bob  Waldstein  (AT&T) 

Les  Wibberley  (Chemical  Abstracts) 

Ralph  LeVan  (OCLC) 

RAY  DENENBERG:  BACKGROUND  QUESTIONS 

1.  What  is  Z39.50?  What  is  Z39.50  NOT?

2.  What  is  a  client  and  server?  Origin  and  target?  What  is  a  Z39.50  client  and
server?  WAIS  client  and  server?

3.  How  do  Z39.50,  TCP,  OSI,  and  WAIS  fit  together?

4.  How  was  Z39.50  developed?  What  are  the  different  versions?  What  groups  are
participating  in  its  ongoing  development?  Is  there  international  participation?

Regarding  questions  1  and  2,  in  terms  of  the  client/server  model:  if  you  start  with
a  basic  hypothetical  information  retrieval  application  and  assume  that  you  view  it
logically  as  divided  into  two  components,  a  user  application  and  a  database  engine,  the
user  application  takes  commands  from  the  user  and  formulates  those  commands  into
queries  that  are  understandable  by  the  database  engine,  and  the  database  engine
accesses  the  database  and  formulates  the  results  into  a  response  that  is  understandable
to  the  user  application.  The  key  is  that  the  two  components  understand  each  other.  If
you  then  view  this  in  terms  of  the  client/server  model,  you  have  a  client  system  and  a
server  system.  You  can  consider  the  user  application  to  be  the  client.  You  can  consider
the  database  engine  to  be  the  server.  The  two  are  split  across  two  systems,  with  the  line
representing  inter-system  communications.

The  client/server  model  as  represented  is  fine  as  long  as  the  client  and  server
understand  each  other  even  though  they  are  split  across  two  systems.  It's  not  necessary
to  this  model  that  they  be  split  across  two  systems;  they  could  be  on  the  same  system  and
be  different  processes,  in  which  case  the  line  would  represent  inter-process
communications.  But  we  like  to  think  of  them  as  split  across  two  systems  so  we  can  talk
about  them  as  a  client  system  and  a  server  system.

We  want  to  address  the  case  where  the  client  system  and  the  server  system  don't
understand  each  other,  because  we  may  want  to  have  client  systems  which  access  a
variety  of  server  systems,  servers  which  are  accessible  to  a  variety  of  different  clients.
We  need  some  translation  capability.  The  key  is  that  both  the  client  and  server
translate  into  a  common  format  and  that  only  one  instance  of  the  translation  capability
must  be  implemented.  The  translation  capability  which  we  are  talking  about  is  Z39.50.
Z39.50  is  the  protocol  for  communications  between  an  information  retrieval  client  and 
an  information  retrieval  server.  Z39.50  is  a  specification  which  is  divided  into  client-  and 
server-  like  components  referred  to  as  the  origin  and  the  target.  The  origin  is  interfaced 
with  the  client;  the  target  is  interfaced  with  the  server.  The  Z39.50  origin  and  Z39.50 
target  communicate  with  one  another  according  to  the  Z39.50  protocol. 
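
The value of translating into one common format can be sketched in Python. This is a toy illustration only: the dictionary used as the common form stands in for whatever the protocol actually carries and is not the Z39.50 encoding, and the two client command syntaxes and two server query languages are invented for the example.

```python
# Client-side translators: local command syntax -> common query form.
def origin_verbose(command):
    """Hypothetical client A syntax: 'find <field> <term>'."""
    _, field_name, term = command.split(maxsplit=2)
    return {"field": field_name, "term": term}


def origin_terse(command):
    """Hypothetical client B syntax: '<field>:<term>'."""
    field_name, term = command.split(":", 1)
    return {"field": field_name, "term": term}


# Server-side translators: common query form -> local database language.
def target_sql_like(query):
    """Hypothetical server X, which speaks an SQL-like language."""
    return "SELECT * FROM docs WHERE {field} LIKE '%{term}%'".format(**query)


def target_tagged(query):
    """Hypothetical server Y, which speaks a tagged-field language."""
    return "{}={}".format(query["field"].upper(), query["term"])


clients = {"A": (origin_verbose, "find title whales"),
           "B": (origin_terse, "title:whales")}
servers = {"X": target_sql_like, "Y": target_tagged}

# Four client/server pairings are served by only 2 + 2 translators,
# because every pairing passes through the same common form.
for client_name, (to_common, command) in clients.items():
    common = to_common(command)
    for server_name, from_common in servers.items():
        print(client_name, "->", server_name, ":", from_common(common))
```

Each side writes one translator to the common form rather than one per partner, which is the sense in which only one instance of the translation capability has to be implemented.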

Once  the  Z39.50  protocol  is  introduced  into  the  client/server  model  discussion, 
then  you  must  be  careful  about  the  use  of  the  terms  client  and  server  because  people  use 
the  terms  client  and  server  to  mean  the  client  system  and  the  server  system,  or  the  client 
application  and  the  server  application,  or  the  client  application  together  with  Z39.50
origin  and  the  server  application  together  with  the  Z39.50  target.  So,  in  order  to  answer 
the  question  of  what  is  a  Z39.50  client  and  server,  the  context  must  be  known.  If  you 
really  mean  the  Z39.50  origin  and  Z39.50  target,  it  is  probably  best  to  use  those  two 
terms. 

I'll  return  to  this  discussion  later,  but  first  I  want  to  address  the  second  part  of 
question  1:  what  is  Z39.50  NOT?  I  may  not  be  able  to  answer  this  fully,  but  it  is  just  as 
important,  or  maybe  more  important,  as  the  answer  to  the  question  of  what  Z39.50  IS. 
For  one  thing,  I  mentioned  that  there  was  one  element  missing  from  the  previous 
example,  and  that  was  the  translation  capability.  In  other  words,  the  client  application
has  to  translate  to  the  common  representation;  and  the  server  application  has  to 
translate  to  a  form  that  is  understood  by  the  Z39.50  target.  That  can  be  quite  a  bit  of 
work.  That  aspect,  the  translation,  is  NOT  part  of  the  Z39.50  standard.  The  standard 
only  addresses  the  protocol  for  communication  between  the  target  and  the  origin. 

A  couple  of  other  things  that  Z39.50  is  NOT.  It  is  not  a  user  interface  or  a 
command  language.  It  does  not  specify  any  dialogue  between  the  user  and  the  user 
application.  It  is  not  a  database  management  system;  it  is  not  a  database. 

Back  to  the  question  of  the  use  of  the  terms  client  and  server.  If  you  want  to 
refer  to  a  WAIS  client  and  a  WAIS  server  that  is  fine.  WAIS  incorporates  Z39.50,  so 
you  can  think  of  the  WAIS  client  as  being  the  application  together  with  the  Z39.50 
origin  or  the  WAIS  server  as  incorporating  the  Z39.50  target. 

That  covers  questions  1  and  2.  Question  3  covers  the  relationship  among  Z39.50,
TCP,  OSI,  and  WAIS.  OSI  provides  a  seven  layer  reference  model.  That  reference 
model  is  used,  among  other  things,  for  the  description  of  protocols.  Z39.50  is  an
application  protocol,  which  means  that  it  provides  direct  support  to  the  end  application. 
So  Z39.50  is  at  layer  7,  which  is  the  application  layer.  At  layers  3  and  4  is  TCP/IP. 
TCP  and  IP  are  two  distinct  protocols  which  reside  within  the  OSI  framework  at  layers  3 
and  4.  Almost  all  of  the  implementations  today  run  Z39.50  directly  over  TCP/IP.
However,  there  are  alternative  OSI  protocols  that  are  equivalent  in  functionality  but  not 
compatible  with  TCP  and  IP. 

At  layer  6  where  nothing  is  pictured  and  at  layer  7  there  are  two  additional
protocols  which  merit  brief  mention.  The  reason  being  that,  as  the  Z39.50  applications 
continue  to  evolve,  we  expect  the  two  protocols  which  I  am  about  to  describe  to  become 
increasingly  important.  At  layer  7  there  is  what  is  called  the  association  control  protocol 
which,  if  used,  would  be  used  in  conjunction  with  Z39.50  at  the  application  layer.  What 
it  is  used  for  is  to  specify  application  context;  the  TCP  analogy  is  to  use  well-known 
ports.  Many  of  us  doubt  that  for  the  long  term  that  that  will  hold  up,  to  be  able  to 
develop  Z39.50  for  multiple  applications  on  a  single  host.  Also,  the  presentation 
protocol  which  I  won't  discuss  extensively  except  to  say  that  people  may  begin  to  realize
that  it  is  necessary  to  be  able  to  negotiate  the  different  syntaxes  to  be  used  during  a
session.  Les  will  talk  more  about  syntaxes  which  is  a  very  important  element  in  Z39.50,
the  ability  to  represent  the  different  types  of  record  syntaxes  which  must  be  dealt  with 
during  a  session. 

To  complete  the  OSI  or  reference  model  picture:  Typically  an  application  is 
pictured  to  reside  at  the  top  of  layer  7.  So,  if  you  consider  Z39.50  an  integral  part  of 
WAIS,  you  would  picture  WAIS  as  an  application  residing  part  in  layer  7  and  part  above
the  seven-layer  reference  model. 

On  to  question  4:  There  is  Z39.50  which  is  an  American  National  standard;  there 
is  a  corresponding  and  compatible  international  standard  called  Search  and  Retrieve 
(SR).  Z39.50  was  first  balloted  in  1984;  that  ballot  failed.  It  was  re-balloted  in  1987;  the
1988  version  was  approved  by  ANSI  in  1988.  In  parallel,  the  ISO  search  and  retrieve 
(SR)  protocol  was  introduced  in  1984.  It  was  approved  in  1991.  So  though  there  was  a 
strong  attempt  to  coordinate  these  two  protocols,  because  of  the  timing,  by  the  time  the 
SR  protocol  was  approved  there  were  incompatibilities  between  it  and  the  1988  Z39.50 
version.  There  was  an  effort  to  align  these  two  protocols.  Version  2  which  was
approved  in  1992  was  more  than  an  attempt  to  align  Z39.50  with  SR;  it  provides  a 
number  of  features  beyond  SR.  Z39.50-1992  is  a  compatible  superset  of  SR;  it 
incorporates  features  that  the  implementors  demanded  be  put  into  Z39.50  in  order  to 
make  it  economically  viable  for  them  to  implement.  In  1990,  the  ZIG,  the  Z39.50 
Implementors  Group,  was  established,  and  the  Library  of  Congress  was  designated  as  the 
maintenance  agency.  These  are  two  independent  groups  which  work  very  closely 
together.  The  maintenance  agency's  immediate  mandate  was  to  produce  version  2  and 
to  achieve  compatibility  with  SR.  Once  that  was  accomplished  in  1992,  the  maintenance 
agency  in  close  coordination  with  the  ZIG  began  work  on  version  3  which  is  projected  to 
be  available  in  1994. 

JIM  FULLTON:  Z39.50,  WAIS,  AND  OTHER  TOOLS

5.  What  is  the  "WAIS  protocol"?  Is  there  such  a  thing?  Is  there  a  "WAIS 
specification"?  What  is  the  process  for  developing  the  WAIS 
protocol/specification?  Is  there  a  process  analogous  to  the  Z39.50  process? 

6.  What  is  the  relationship  between  WAIS  and  Z39.50?  The  difference?  Can  WAIS 
be  described  as  an  application  of  Z39.50? 

7.  What  is  the  relationship  among  the  versions  of  Z39.50  with  respect  to  the  versions 
of  WAIS?  Which  Z39.50  versions  are  compatible  with  which  WAIS  versions? 
What  are  the  long-term  implications? 

8.  What  capabilities  do  you  gain  by  using  WAIS  instead  of  Z39.50?  What  capabilities 
do  you  lose? 

9.  What  are  the  relationships  among  Z39.50,  WAIS,  WWW,  Gopher?  How  does 
Gopher  use  WAIS?  How  does  Gopher  use  Z39.50? 

10.  How  and  when  will  these  various  tools  be  harmonized? 

11.  How  do  you  use  Z39.50  and/or  WAIS  to  discover  Z39.50  WAIS  servers?  Can 
Z39.50  be  used  to  build  a  server  navigation  tool  to  discover  and  navigate  among 
resources? 

12.  What  are  the  hardware/software  components  that  make  up  a  WAIS/Z39.50 
system?  What  development  tools  are  available?  How  are  WAIS  and  Z39.50 
integrated? 

Regarding  question  5.  "What  is  the  "WAIS  protocol"?  Is 
there  such  a  thing?"  I  like  to  use  the  term  WAIS  protocol  suite  instead  of  WAIS  protocol 
because  there  is  more  to  it  than  just  one  particular  thing;  there  is  a  collection  of 
communication  standards  and  document  identification  entities  that  are  built  into  it  that 
make  it  work.  WAIS  consists  of  the  Z39.50  protocol  engine  version  1  along  with  some 
additional  items  which  are  required  to  have  an  information  publishing  system.  Just 
having  a  protocol  by  itself  is  useless;  you  have  to  have  a  lot  of  additional  stuff  built 
around  it  that  actually  lets  you  describe  the  way  documents  are  presented  to  the  user, 
that  lets  you  locate  documents  and  data  objects  on  the  server,  and  things  of  that  sort.  So 
I  like  to  talk  about  a  suite  of  protocols,  and  one  of  those  is  Z39.50. 

Is  there  a  "WAIS  specification"?  The  best  way  to  describe  the  WAIS  specification 
is  the  WAIS  source  code;  if  your  code  works  with  that  code,  then  you've  followed  the 
WAIS  specification.  There  are  some  documents  which  describe  some  of  the  various  data 
structures  and  components  that  are  associated  with  the  Z39.50  protocol  engine  in  WAIS 
and  those  are  available.  The  actual  description  of  the  bit  streams  that  go  across  the 
network  is  not  on  paper  or,  if  it  is,  I  am  not  aware  of  it. 

What  is  the  process  for  developing  the  WAIS  protocol/specification?  Is  there  a 
process  analogous  to  the  Z39.50  process?  The  WAIS  protocol  and  specification  were 
developed  when  Brewster  put  the  project  together.  They  designed  what  they  wanted  and 
implemented  it  and  came  out  with  what  we  know  and  love  today.  There  isn't  a 
standards  process  analogous  to  the  Z39.50  process.  There  isn't  an  implementor's  group 
or  a  standards  body  which  meets  and  approves  what  is  being  developed  in  the  WAIS 
world. 

Ray  has  answered  question  6:  What  is  the  relationship  between  WAIS  and 
Z39.50?  Z39.50  is  the  protocol  component  of  the  WAIS  system.    Can  WAIS  be 
described  as  an  application  of  Z39.50?  I  would  say  yes. 

What  is  the  relationship  among  the  versions  of  Z39.50  with  respect  to  the  versions 
of  WAIS?  These  are  apples  and  oranges  approaches.  Z39.50  is  the  underlying  protocol 
engine  which  underlies  the  other  components  required  for  a  functioning  information 
system,  a  functioning  application.  The  current  freeware  release  of  WAIS  is  based  on 
Z39.50  version  1  and  the  work  we  are  now  doing  is  to  try  to  attach  a  Z39.50  version  2 
protocol  stack  to  that.  It  is  not  a  replacement  for  the  version  1  stack  but  it  is  an 
extension  of  it. 

Which  Z39.50  versions  are  compatible  with  which  WAIS  versions?  That's  a 
loaded  question  because  regardless  of  which  systems  you  use  it's  possible  to  put  together 
gateways  which  allow  one  system  to  interact  with  another  system.  Ralph  will  discuss  this 
further.  Right  now  WAIS  is  Z39.50  version  1-compliant.  Systems  which  are  compliant 
with  Z39.50  version  2  will  not  directly  communicate  with  WAIS  servers  yet  but  that  is  a 
problem  which  is  being  worked  on.  I  anticipate  a  fully  compatible  version  2  in  the  near 
future.  I  can  speak  for  CNIDR  and  Brewster  is  nodding  his  head  that  WAIS  is  doing 
something  similar. 

What  are  the  long-term  implications?  There  are  some  very  important  long-term 
implications.  When  everything  is  version  2  or  version  3  compliant,  WAIS  systems  as  well 
as  other  Z39.50-compliant  information  systems  will  be  able  to  interoperate  at  least  at  a 
basic  level.  If  you  have  a  client  that  is  for  information  system  A  and  a  server  that  is  for 
information  system  B,  then  they  will  be  able  to  talk  to  each  other  so  that  you  do  get 
some  general  interoperability  between  systems,  which  means  that  the  client  that  is  on 
your  machine  will  be  able  to  interoperate  with  lots  of  different  systems  on  the  network 
regardless  of  who  you  bought  it  from  or  what  its  original  intention  was. 

What  capabilities  do  you  gain  by  using  WAIS  instead  of  Z39.50?  What  capabilities 
do  you  lose?  Once  again,  I  think  this  is  an  apples  and  oranges  question.  Since  we  are 
discussing  a  protocol  and  the  different  versions  of  a  protocol  versus  an  application  which 
makes  use  of  that  protocol,  it's  not  something  that  can  be  compared  in  the  context  of  this 
question. 

What  are  the  relationships  among  Z39.50,  WAIS,  WWW,  Gopher?  How  does 
Gopher  use  WAIS?  How  does  Gopher  use  Z39.50?  Z39.50  is  a  protocol;  WAIS  is  a 
system;  Worldwide  Web  is  a  system;  Gopher  is  a  system.  None  of  the  three  systems  use 
the  same  protocol,  but  they  all  interoperate  through  gateways.  You  can  generally  make 
systems  interoperate  through  gateways  without  a  tremendous  amount  of  difficulty. 
Sometimes  you  lose  some  functionality  depending  on  how  well  the  gateway  is 
constructed,  but  typically  you  can  make  systems  interoperate. 

That  answers  how  Gopher  uses  WAIS.  Gopher  uses  WAIS  through  a  gateway. 
The  Gopher  client  connects  to  a  gateway  which  speaks  Gopherese  and  this  gateway 
converts  the  query  issued  by  the  Gopher  client  into  something  which  can  be  understood 
by  the  WAIS  server.  Essentially,  it  is  a  Gopher  server  connected  to  a  WAIS  client,  some 
translation  takes  place  in  the  middle,  the  query  is  sent,  an  answer  is  returned,  more 
translation  takes  place,  and  a  response  is  given  to  the  original  query. 
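
The translation steps just described can be sketched in a few lines of Python. This is an illustration only, not the code of any actual gateway: the names WaisHit, gateway, and wais_search are hypothetical, the WAIS side is reduced to a function that takes a query string and returns scored hits, and the host name gateway.example is a placeholder. The Gopher side assumes the usual tab-separated search request and menu-line layout.

```python
# Hypothetical sketch of a Gopher-to-WAIS gateway; not taken from any real gateway.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class WaisHit:
    score: int      # relevance score reported by the (stand-in) WAIS search
    headline: str   # one-line description of the document
    doc_id: str     # opaque identifier used to fetch the document later


def parse_gopher_search(request_line: str) -> Tuple[str, str]:
    """Split a Gopher search request 'selector<TAB>query' into its two parts."""
    selector, _, query = request_line.rstrip("\r\n").partition("\t")
    return selector, query


def format_gopher_menu(hits: List[WaisHit], host: str, port: int) -> str:
    """Render hits as Gopher menu lines (item type '0' marks a text file)."""
    lines = [f"0{h.headline}\t{h.doc_id}\t{host}\t{port}" for h in hits]
    return "\r\n".join(lines + ["."]) + "\r\n"


def gateway(
    request_line: str,
    wais_search: Callable[[str], List[WaisHit]],
    host: str = "gateway.example",   # placeholder host name
    port: int = 70,
) -> str:
    """Translate a Gopher query to a WAIS query, then translate the answer back."""
    _selector, query = parse_gopher_search(request_line)    # Gopher -> query text
    hits = wais_search(query)                               # query sent, answer returned
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return format_gopher_menu(ranked, host, port)           # answer -> Gopher menu
```

Because the gateway only reshapes what passes through it, anything the menu format cannot carry (here, the relevance score itself) is dropped on the way back, which is the loss of functionality mentioned above.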

Gopher  and  Z39.50  work  the  same  way  although  the  gateways  are  not  as 
commonly  used.  Since  Gopher  is  used  frequently  in  Campus-Wide  Information  Systems, 
there  is  a  strong  desire  to  make  Gopher  clients  capable  of  accessing  Z39.50-based 
systems  that  are  used  in  libraries.  People  want  to  be  able  to  get  to  card  catalogs,  people 
want  to  be  able  to  get  to  non-bibliographic  information  stored  under  Z39.50  systems.  So 
gateways  do  work.  I  don't  know  where  it  is  but  I  do  know  that  a  Gopher  client  like  the 
one  described  exists. 

How  and  when  will  these  various  tools  be  harmonized?  Harmonized  is  an 
interesting  word  because  there  are  lots  of  ways  to  harmonize  tools.  You  can  either 
create  an  application  that  talks  to  lots  of  the  different  applications  through  the  use  of 
different  protocols;  one  example  of  that  is  NCSA  Mosaic.  You  get  one  user  interface 
that  lets  you  operate  natively  with  lots  of  different  tools  because  it  is  essentially  a 
collection  of  clients  in  one  package  sitting  on  your  desktop  that  goes  out  and  talks  to 
different  systems.  The  advantage  of  that  is  everything  looks  the  same,  you  have  one  tool 
sitting  on  your  desk  and  you  know  how  to  work  it  for  all  of  the  different  systems.  The 
disadvantage  of  this  is  that  you  have  a  really  big  client  and  if  one  thing  changes,  then  you 
have  to  go  and  change  your  application  and  redistribute  it  to  everyone,  you  can't  just 
change  and  redistribute  parts.  Another  way  to  do  this  is  through  a  gateway.  The 
advantage  of  a  gateway  is  that  if  something  changes  only  the  gateway  must  be  changed. 
The  disadvantage  is  that  you  can  lose  something  with  a  gateway.  A  gateway  does  not 
always  represent  the  richness  of  the  data  that  can  be  accessed  using  the  native 
application. 

How  do  you  use  Z39.50  and/or  WAIS  to  discover  Z39.50  WAIS  servers?  Those 
who  have  used  WAIS  know  that  there  is  a  directory  of  servers  which  can  be  queried. 
The  server  returns  a  list  of  information  sources  or  servers  which  seem  relevant  to  the 
query,  and  then  those  information  resources  can  be  accessed.  Systems  that  are  based  on 
Z39.50  are  applications,  and  the  way  that  you  find  resources  on  the  network  is 
application-bound:  it  depends  on  how  your  application  wants  to  allow  it.  If  you  were  to 
put  a  new  protocol  engine  into  WAIS,  the  system  for  finding  resources  would  not  change 
because  it  is  an  application-bound  process  as  opposed  to  a  protocol-bound  process.  It  is 
a  question  of  what  information  you  send  between  the  client  and  the  server  which  allows 
you  to  locate  resources. 
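
A rough sketch of the directory-of-servers idea follows, under stated assumptions. The function rank_sources and the sample directory entries are hypothetical, and plain word overlap is used here only as a stand-in for whatever relevance ranking a real directory server applies. The point it illustrates is the one made above: locating resources depends on what information the application exchanges, not on which protocol engine carries it.

```python
# Hypothetical sketch: querying a directory of information sources.
from typing import Dict, List, Tuple


def rank_sources(query: str, directory: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (source name, score) pairs for sources whose description shares
    words with the query, best match first. Word overlap is an assumed,
    simplified relevance measure."""
    terms = set(query.lower().split())
    scored = []
    for name, description in directory.items():
        score = len(terms & set(description.lower().split()))
        if score:
            scored.append((name, score))
    # Highest score first; ties broken alphabetically so the order is stable.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Made-up directory entries, for illustration only.
directory = {
    "patents.src": "patent numbers and chemical formulas",
    "weather.src": "daily weather reports",
    "chem.src": "chemical abstracts",
}
relevant = rank_sources("chemical patent", directory)
```

The client would then open each returned source in turn and repeat the query there.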

Can  Z39.50  be  used  to  build  a  server  navigation  tool  to  discover  and  navigate 
among  resources?  Yes.  You  can  use  Z39.50  to  build  a  server  navigation  tool  because 
the  important  thing  is  the  information  that  you  are  exchanging  and  you  can  use  the 
Z39.50  protocol  to  exchange  all  of  the  information  you  need  to  be  able  to  locate 
resources  on  the  network.  Once  again,  it  is  an  application  problem. 

What  are  the  hardware/software  components  that  make  up  a  WAIS/Z39.50 
system?  You  need  at  least  one  computer;  to  extend  that,  you  need  two  computers  and  a 
network.  Computers  have  gotten  really  cheap.  You  can  go  out  and  buy  an  incredible 
workstation  for  a  small  amount  of  money  which  will  allow  you  to  provide  access  to  lots 
of  information.  Clients  for  systems  which  are  based  on  Z39.50  including  WAIS  are 
available  on  the  network  for  PCs  running  Microsoft  Windows,  DOS,  or  any  number  of 
different  operating  systems;  all  you  have  to  do  is  get  them  and  use  them. 

What  development  tools  are  available?  Most  of  these  applications  are  written  in 
C,  and  you  need  to  be  a  C  programmer  to  be  able  to  develop  applications. 

How  are  WAIS  and  Z39.50  integrated?   Z39.50  is  the  protocol  stack  for  WAIS. 

BOB  WALDSTEIN:  Z39.50  IMPLEMENTATION(S) 

13.  Who  currently  has  implementations,  production  or  otherwise,  of  Z39.50? 

14.  What  are  the  various  Z39.50  platforms?  For  origins?  For  targets? 

15.  What  programming  languages  and  other  tools  are  used  for  building  Z39.50 
implementations? 

16.  When  are  non-bibliographic  implementations  likely  to  be  available? 

17.  What  are  "self-describing"  targets?  Origins?  How  does  an  origin  learn  the  details 
of  what  a  target  supports? 

18.  Could  Z39.50  be  used  for  a  CWIS  system?  What  enhancements  are  needed  (and 
being  developed)  in  this  respect? 

19.  How  do  you  build  clients  that  access  both  WAIS  servers  and  ordinary  Z39.50 
servers?  How  do  you  build  servers  that  can  be  accessed  by  WAIS  clients  and 
ordinary  Z39.50  clients? 

The  implementation  I  know  best  is  mine.  I  have  between  80  and  100  databases 
my  users  use;  we  have  about  50  to  60  databases  which  are  internal  such  as  the 
circulation  database.  We  have  between  2,000  and  3,000  users  of  the  database,  with  a 
potential  population  of  100,000;  this  is  an  internal  system  and  not  for  outside  use. 

Commercial  implementations  of  Z39.50  version  2  (1992)  are  available  for  sale  from 
OCLC  and  RLG.  People  are  already  contracting  with  them.  Penn  State  says  that  it  has 
70,000  people  using  RLG's  system  through  a  server  and  it  is  working  perfectly.  People 
don't  know  that  Z39.50  is  there  and  everyone  is  very  happy 
with  it. 

From  my  perspective,  it  is  very  important  that  all  of  the  library  vendors  are  buying 
into  Z39.50.  When  you  go  to  your  local  catalog  in  your  local  library,  technically  all  of 
those  catalogs  will  be  able  to  link  to  every  other  catalog  everywhere  else.  All  of  the  big 
vendors  I  know  about  are  paying  attention. 

The  other  thing  that  is  important  to  me  is  that  all  of  the  major  information 
vendors  are  paying  attention  to  Z39.50.  Mead  has  stated  a  date  when  they  expect  to 
have  a  Z39.50  system  available.  Chemical  Abstracts  and  BRS  are  talking  about  systems. 
Dialog  appeared  at  the  last  ZIG  meeting.  All  of  the  major  information  providers  that  I 
want  access  to  are  starting  to  appear  at  the  Z39.50  meetings. 

My  stuff  runs  under  UNIX.  The  nice  thing  about  platforms  in  terms  of  targets  is 
that  I  don't  care.  RLG  is  running  under  whatever  RLG  is  running  under;  I  talk  Z39.50 
and  it  works  fine.  You  only  care  about  the  platform  if  you  are  buying  something. 

In  terms  of  origins,  I  plan  to  get  all  of  the  free  code.  My  stuff  all  works  under 
UNIX  systems.  The  other  thing  interesting  in  my  world  is  that  I  am  the  only  application 
I  know  running  over  DataKit  and  not  over  TCP.  That  is  important  to  me  because  it 
means  that  my  modems  are  hooked  to  DataKit  which  means  I'll  be  hooked  to  modems 
probably  within  the  next  month  or  less  and  I  won't  have  to  rip  apart  source  code  and 
replace  modems. 

All  of  my  stuff  is  written  in  C. 

Most  of  my  databases  are  non-bibliographic.  A  few  are  bibliographic,  but  most 
are  non-bibliographic.  Chemical  Abstracts  is  going  to  do  all  kinds  of  slick  stuff  that  is 
non-bibliographic. 

My  clients  are  totally  stupid.  My  clients  do  not  know  MARC.  If  you  connect  with 
my  client  and  it  connects  to  someone  and  if  the  only  thing  that  person  delivers  is  MARC, 
the  only  thing  my  client  knows  to  do  is  take  it  apart  and  present  the  001  field  followed 
by  the  002  field  complete  with  all  the  subtags  and  it  looks  awful.  And  the  vendors  have 
complained  to  me  that  my  users  use  my  client  and  it  accesses  them  and  it  looks  awful.  I 
have  now  fixed  that.  My  clients  now  learn  through  a  thing  called  "Explain"  how  a 
database  is  structured  and  presented,  and  now  everything  is  starting  to  look  nice  with 
pop-ups  that  tell  the  users  what  is  needed. 

Dartmouth  is  doing  what  sounds  to  me  like  the  most  complete  gateway  system 
that  I've  heard  of:  they  have  their  own  protocol  from  their  MAC  clients  to  a  gateway 
and  that  gateway  talks  every  protocol  that  I  know  of--Z39.50-1992,  WAIS,  Gopher,  and 
all  the  others. 

John  Kunze  of  the  University  of  California,  Berkeley,  is  the  source  of  the  only  public 
domain  CWIS  using  Z39.50  version  2.  There  is  also  Project  Mercury  at  Carnegie-Mellon 
which  is  Z39.50  based. 


LES  WIBBERLEY:  INFORMATION  RETRIEVAL 

20.  What  are  the  Bib-1  and  Info-1  attribute  sets? 

21.  Are  there  attribute  sets  under  development  for  non-bibliographic  applications? 

22.  Indexing  is  not  implemented  in  a  standard  manner  across  systems;  does  Z39.50 
address  this? 

23.  Does  Z39.50  only  retrieve  MARC  records?  What  if  I  have  other  types  of  records 
to  retrieve? 

24.  What  is  a  "record  syntax"?  "element  set"? 

25.  How  do  WAIS  and/or  Z39.50  handle  text  records,  and  the  different  ways  they  are 
represented  (ASCII  vs.  EBCDIC,  newline  vs.  carriage  return)?  How  does  Z39.50 
differ  in  this  respect? 

26.  What  types  of  data  can  WAIS  index  and  retrieve?  Z39.50?  How  is  data  prepared 
for  retrieval  by  WAIS?  How  does  this  differ  from  Z39.50?  Are  there 
fundamentally  different  premises  of  Z39.50  and  WAIS  in  this  regard? 

27.  If  information  is  searchable/retrievable  by  WAIS,  can  a  Z39.50  origin  also  access 
it? 

Within  Z39.50  there  are  some  standard  query  formats  that  say  how  to  express  a 
question.  One  of  them  is  called  the  "reverse  Polish  notation"  query.  Within  that  query 
structure,  search  terms  are  identified  by  something  that  is  called  an  "attribute."  So  when 
you  hear  the  term  attribute,  an  attribute  indicates  what  kind  of  information  is  being 
included  in  a  search  term.  Is  it  an  author,  is  it  a  title,  how  do  you  want  to  use  this  in 
searching?  Z39.50  includes  a  built-in  attribute  set  called  bib-1  and  that  is  oriented 
towards  bibliographic  searching,  but  the  protocol  is  modular  in  that  different  conventions 
can  be  plugged  in.  A  different  attribute  set,  for  example,  for  scientific  and  technical 
searching  can  be  plugged  into  the  protocol  modularly.  So  that  is  what  the  concept  of  an 
attribute  is.  There  are  many  kinds  of  attributes:  one  is  to  tell  how  the  term  is  used; 
others  indicate  relation;  the  structure  of  the  search  term,  is  it  a  word  or  phrase;  is  it 
truncated.  So  it  allows  you  to  very  explicitly  indicate  your  search  terms  and  indicate 
what  kind  of  terms  they  are. 

Are  there  attribute  sets  under  development?  Yes,  there  are  and  what  I'm  going 
to  show  you  is  an  example  of  another  non-bibliographic  attribute  set.  This  one  is  called 
the  scientific  and  technical  attribute  set.  It  is  a  listing  of  search  terms  within  databases 
which  carry  scientific  and  technical  information.  For  example,  there  are  patent  numbers, 
patent  application  dates,  molecular  formulas,  boiling  points,  chemical  names,  as  well  as 
the  traditional  abstract  title. 
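
To make the attribute idea concrete, here is a small sketch of a reverse Polish notation query whose terms carry attributes, evaluated against a few made-up records. It is a simplification under stated assumptions: the Term class, the field names, and the sample records are hypothetical, only a "use" attribute and a right-truncation flag are modelled, and attributes are written as words rather than in the numeric form an attribute set would define. The operators AND, OR, and AND-NOT follow operands, as in any reverse Polish expression.

```python
# Hypothetical sketch of an attribute-tagged reverse Polish notation query.
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass(frozen=True)
class Term:
    use: str                  # how the term is used, e.g. "author" or "title"
    value: str                # the search term itself
    truncated: bool = False   # True means match any word starting with value


Token = Union[Term, str]      # a query is a list of Terms and operator names


def matches(term: Term, record: Dict[str, str]) -> bool:
    """Check one term against the record field named by its use attribute."""
    words = record.get(term.use, "").lower().split()
    wanted = term.value.lower()
    if term.truncated:
        return any(word.startswith(wanted) for word in words)
    return wanted in words


def evaluate_rpn(query: List[Token], records: List[Dict[str, str]]) -> Set[int]:
    """Evaluate the query with a stack; returns indexes of matching records."""
    stack: List[Set[int]] = []
    for token in query:
        if isinstance(token, Term):
            stack.append({i for i, r in enumerate(records) if matches(token, r)})
            continue
        right, left = stack.pop(), stack.pop()
        if token == "AND":
            stack.append(left & right)
        elif token == "OR":
            stack.append(left | right)
        elif token == "AND-NOT":
            stack.append(left - right)
        else:
            raise ValueError(f"unknown operator: {token}")
    if len(stack) != 1:
        raise ValueError("malformed query")
    return stack.pop()


# Made-up records, for illustration only.
records = [
    {"author": "smith", "title": "network searching"},
    {"author": "jones", "title": "network protocols"},
    {"author": "smith", "title": "library catalogs"},
]
query = [Term("author", "smith"), Term("title", "net", truncated=True), "AND"]
hits = evaluate_rpn(query, records)
```

Swapping in a different attribute set amounts to allowing different values for the use attribute (a patent number or a boiling point, say) without changing the query structure.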

Does  Z39.50  only  retrieve  MARC  records?  (Question  23)  No,  there  is  a  concept 
within  Z39.50  called  record  syntax  and  when  you  request  to  retrieve  information  via 
Z39.50  you  can  request  how  that  information  is  packaged.  In  brief  terms,  the  record 
syntax  is  how  do  you  package  the  information.  One  packaging  for  bibliographic 
information  is  called  a  MARC  record.  Another  type  of  packaging  is  called  generic 
record  syntax  which  allows  you  to  tag  individual  fields  with  very  detailed  information 
about  whether  it  is  gif,  or  tif,  or  whatever.  So  there  is  a  lot  of  flexibility  which  can  be 
plugged  into  the  protocol  for  delivering  a  variety  of  different  types  of  data. 
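
The packaging idea can be sketched as a lookup from a requested syntax name to a function that packages the same record differently. Everything named here is an assumption made for illustration: the syntax names "plain-text" and "tagged", the functions, and the sample record are hypothetical, and neither packaging reproduces the actual MARC or generic record syntax layouts.

```python
# Hypothetical sketch: one record, packaged according to the requested syntax.
from typing import Callable, Dict, List, Tuple

# A field is (content_type, value); "text" and "gif" are illustrative labels.
Field = Tuple[str, str]
Record = Dict[str, Field]


def as_plain_text(record: Record) -> str:
    """Package only the textual fields, as 'name: value' lines."""
    return "\n".join(
        f"{name}: {value}"
        for name, (content_type, value) in record.items()
        if content_type == "text"
    )


def as_tagged_fields(record: Record) -> List[Tuple[str, str, str]]:
    """Package every field as (name, content_type, value), so the receiver
    can tell an image from text."""
    return [(name, ctype, value) for name, (ctype, value) in record.items()]


PACKAGERS: Dict[str, Callable[[Record], object]] = {
    "plain-text": as_plain_text,
    "tagged": as_tagged_fields,
}


def present(record: Record, preferred_syntax: str) -> object:
    """Return the record packaged in the syntax the requester asked for."""
    if preferred_syntax not in PACKAGERS:
        raise ValueError(f"unsupported record syntax: {preferred_syntax}")
    return PACKAGERS[preferred_syntax](record)


# Made-up record, for illustration only.
record = {
    "title": ("text", "Boiling points"),
    "figure": ("gif", "IMAGE-BYTES"),
}
```

Adding another packaging means adding one entry to the table; the request and retrieval steps stay the same.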

How  do  WAIS  and/or  Z39.50  handle  text  records,  and  the  different  ways  they  are 
represented  (ASCII  vs.  EBCDIC,  newline  vs.  carriage  return)?  How  does  Z39.50  differ  in 
this  respect?  (Question  25)  How  Z39.50  handles  text  records  is  fairly  straightforward; 
it  is  one  particular  record  syntax.  There  is  a  record  syntax  which  we  are  in  the  process 
of  standardizing  right  now  which  says  that  if  you  want  to  send  text  data,  you  send  it  in 
this  way.  That  will  be  part  of  the  standard  in  the  appendix. 

RALPH  LE  VAN: 

One  of  the  issues  is  the  question  of  whether  there  are  commercial  Z39.50 
products?  OCLC  has  been  selling  access  to  its  databases  via  Z39.50  for  some  time  now. 
We  are  also  selling  a  UNIX-based  system  that  includes  a  database  engine,  a  Z39.50 
server,  that  server  acts  as  a  gateway  so  you  can  use  it  to  access  your  own  databases  or 
get  to  other  people's  Z39.50  servers,  and  it  includes  a  client.  So  there  are  people  trying 
to  make  a  business  out  of  Z39.50  applications  right  now. 

MISCELLANEOUS: 

These  questions  were  not  addressed  due  to  limited  time. 

28.  Can  a  Z39.50  origin  search  multiple  targets  during  a  single  search? 

29.  How  do  you  access  a  particular  Z39.50  application  on  a  remote  server,  via  TCP? 

30.  Is  there  an  RFC  for  Z39.50?  WAIS? 

31.  Is  WAIS  being  implemented  internationally? 

32.  What  is  a  doc-id  (WAIS)?  A  URL?  Is  anyone  using  URLs  in  Z39.50  or  WAIS? 

33.  What  is  Z39.58  and  how  does  it  relate  to  Z39.50? 

34.  What  security  measures  are  provided  by  Z39.50? 

35.  How  does  Z39.50  apply  to  special  characters  and  foreign  languages?  Unicode? 


INTERNET  MULTICASTING  SERVICE 


Carl  Malamud 

I  plan  to  do  two  things:  I'm  going  to  tell  you  a  little  bit  about  the  Internet 
Multicasting  Service,  and  then  I'm  going  to  compare  WAIS  to  a  service  which  at  a  first 
glance  looks  totally  different  and  show  you  why  they  are  actually  doing  the  same  thing 
and  why  I  think  both  of  them  will  work.  And  maybe  we  can  get  some  lessons  out  of  that 
for  some  of  the  projects  that  you  are  working  on. 

The  Internet  Multicasting  Service  is  a  non-profit  located  in  the  National  Press 
Building  along  with  all  of  the  other  press  organizations.  I'm  right  next  to  the  Kansas 
City  Star.  We're  a  little  bit  like  the  Public  Broadcasting  System  although  they  are  a  lot 
bigger.  You  are  looking  at  the  entire  full-time  staff.  We  have  a  few  part-time  stringers, 
contractors,  and  a  system  administrator.  We  like  to  think  of  ourselves  as  "the  flame  of 
the  Internet."  We  are  a  press  organization;  we  publish  information  on  the  network.  We 
publish  about  200  megabytes  per  week  of  data.  We  are  subsidized  by  people  in  industry 
and  foundations;  we  don't  have  any  disk  drive  manufacturers  as  a  sponsor  yet  but  I'm 
confident  that  we'll  get  some  soon. 

We  run  two  channels:  Internet  Talk  Radio  and  Internet  Town  Hall.  Both  of 
those  are  regular  sources  of  data  on  the  network.  A  lot  of  that  data  is  audio  files. 
Internet  Talk  Radio  started  as  strictly  an  audio  metaphor.  What  we  do  is  publish 
standard  audio.  We  take  shows  from  National  Public  Radio  which  we  syndicate  just  like 
any  other  group  would  syndicate,  and  we  put  those  files  on  the  network.  Internet  Town 
Hall  is  related  to  Internet  Talk  Radio;  it's  also  a  source  of  data  on  the  network.  It's 
general  public  affairs  programming,  and  we  do  a  variety  of  special  events.  I'll  talk  more 
about  a  couple  of  those. 

Internet  Talk  Radio  is  my  reaction  to  journals  like  "ComputerWorld,"  "InfoWorld," 
and  the  other  groups  in  the  trade  press  which  provide  zero  information  in  my  opinion.  If 
you  pick  up  "InfoWorld"  it  might  tell  you  that  a  new  product  was  introduced,  but  it 
doesn't  tell  you  if  it  works,  if  it's  useful,  or  what  the  underlying  technology  is.  You  might 
find  an  article  on  WAIS  in  one  of  those  publications,  but  you  are  unlikely  to  find  an 
article  describing  what  Z39.50  is,  let  alone  one  that  describes  the  subtle  variations.  We 
are  trying  to  address  that  gap.  Our  audience  is  primarily  engineers;  this  is  highly 
technical  programming.  Think  of  it  as  MacNeil-Lehrer  meets  Bill  Joy.  Bill  Joy  is  the 
founder  of  Sun  Microsystems  and  its  punk-rock  engineer. 

This  is  very  technical  information.  We  are  doing  an  interview  this  week  on  "Geek 
of  the  Week,"  our  flagship  show,  with  Stewart  Vance  of  TGB  Software  on  different  ways 
of  doing  IP  encapsulation  and  how  that  relates  to  the  next  generation  of  work  for  the 
Internet  protocol.  Good  if  you  like  it;  otherwise  it's  just  a  lot  of  extra  data. 

How  do  we  do  this?  We're  just  like  any  other  radio  station  except  we  don't  use 
airwaves.  We  use  a  fiber  optic  link  to  the  Internet.  UUNET  is  our  main  service 
provider.  We  have  a  10  million  bit  per  second  link  to  UUNET.  I  have  mostly  PCs  and 
a  SPARCStation  in  my  facilities.  I  get  better  performance  talking  to  the  outside  world 
than  I  do  on  my  internal  LAN  right  now  because  the  PC  implementations  aren't  good, 
but  when  my  SPARCStation  talks  to  another  SPARCStation  it  runs  very  quickly. 

We  have  a  variety  of  professional  broadcasting  equipment,  digital  effects 
processors  and  DAT  decks  and  digital  editing  studios,  which  we  use  to  produce  radio. 
We  capture  the  radio  show  as  a  digital  file  at  48  kilohertz;  a  half  hour  of  that  is  several 
hundred  megabytes  of  data.  We  sample  that  down  to  8  kilohertz,  which  is  the  equivalent 
of  what  a  good  phone  line  sounds  like.  It's  not  stereo  quality  but  it's  good  enough  for 
radio. 
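
The sizes quoted above can be checked with a short calculation. The sample rates come from the talk; the sample widths and channel counts are assumptions (16-bit stereo for the studio capture, 8-bit mono for the network version, the latter being consistent with the 64,000 bits per second figure given later in the talk). The function name audio_bytes is made up for this sketch.

```python
# Worked arithmetic for uncompressed audio sizes; widths and channels are assumptions.

def audio_bytes(sample_rate_hz: int, bits_per_sample: int, channels: int, seconds: int) -> int:
    """Size in bytes of uncompressed audio with the given parameters."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


HALF_HOUR = 30 * 60  # seconds

# Studio capture: 48 kHz from the talk; 16-bit stereo is assumed.
studio = audio_bytes(48_000, 16, 2, HALF_HOUR)

# Network version: 8 kHz from the talk; 8-bit mono is assumed.
network = audio_bytes(8_000, 8, 1, HALF_HOUR)

print(studio, network, studio // network)
```

Under these assumptions a half hour shrinks from about 346 million bytes ("several hundred megabytes") to about 14 million bytes, a factor of 24.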

What  most  of  my  listeners  do  is  listen  to  our  show  while  they  are  doing  something 
else  in  the  morning.  If  a  phone  call  comes  in,  they  put  the  show  on  hold  because  it's  not 
really  radio,  they  answer  the  phone  call,  and  restart  it  after  the  phone  call. 

We  do  a  lot  of  field  recording.  Our  studios  are  just  getting  set  up.  We  are  going 
to  begin  doing  live  video  onto  the  Internet  by  September  or  October.  Most  of  what  we 
record  right  now  consists  of  going  to  a  meeting  such  as  this  one,  interviewing  someone 
like  Brewster  Kahle,  and  taking  that  interview  home  where  we  edit  it  and  clean  up  the 
sound. 

Basically,  what  we  do  is  publish  files.  Some  of  it's  text,  some  of  it's  audio.  We'll 
be  looking  at  structured  databases.  We  are  also  looking  at  and  will  be  making  an 
announcement  in  a  few  weeks  about  making  some  sources  of  government  data  available 
which  are  not  currently  available. 

Once  we've  processed  the  files  for  Internet  publication,  we  take  those  files  and 
move  them  to  UUNET,  the  Mother  of  all  FTP  servers.  They  make  about  6  gigabytes  of 
storage  available  to  us.  UUNET  gives  the  data  to  IIJ,  which  is  the  Japanese  network 
provider,  and  to  EUNET,  which  is  a  commercial  provider  in  Europe.  From  there  it  goes 
to  secondaries:  NASA,  for  example,  gets  its  feed  from  UUNET,  as  does  Energy  Science 
and  ANS.  ANS  in  turn  gives  it  to  other  networks.  From  here  it  goes  around  the  world. 

I  don't  maintain  a  distribution  network.  All  I  do  is  put  the  data  in  one  place.  It 
reaches  30  countries.  It  has  a  listenership  of  100,000  who  have  taken  the  time  to  learn 
how  to  download  30  megabyte  files  and  play  them  on  their  SPARC  or  MAC  or  PC. 

The  second  channel  is  the  Internet  Town  Hall,  a  public  affairs  channel.  If  you 
listen  to  the  National  Press  Club  luncheons  these  days,  the  President  introduces  guests 
which  might  include  Senators,  Cabinet  members,  and  recently  the  Dalai  Lama.  They 
welcome  their  members,  the  guests  at  the  luncheon,  their  viewers  on  C-Span,  and  they 
welcome  their  listeners  on  National  Public  Radio  and  the  global  Internet  computer 
network.  We  are  one  of  the  three  networks  which  take  those  luncheons  and  put  them  on 
the  network.  A  lot  of  those  are  available  as  archives  now.  There  are  a  lot  of  servers 
keeping  that  data.  It's  possible  to  do  WAIS  searches  of  the  README  files  and  find 
that  the  Dalai  Lama  appeared  on  Internet  Town  Hall  on  a  particular  day  and  pull  the 
file. 

A  30  megabyte  file  sounds  like  a  lot,  but  once  you  get  to  your  Ethernet  which  is 
10  million  bits  per  second  it's  not  a  lot.  A  sound  file  is  64,000  bits  per  second  so  you're 
using  a  small  portion  of  your  Ethernet  or  your  FDDI  ring.  There  are  a  lot  of 
corporations  such  as  Sun  Microsystems  (which  runs  a  radio  station)  which  take  the  files, 
queue  them,  and  multicast  them  over  their  net  on  a  regular  basis.  It's  not  that  big  of  a 
bandwidth  hog. 
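
The "small portion" claim can be put in numbers. The three figures below are taken from the talk; treating a megabyte as one million bytes and ignoring all protocol overhead are simplifying assumptions, and the function names are made up for this sketch.

```python
# Worked arithmetic: a 64,000 bit/s audio stream on a 10 million bit/s Ethernet.

ETHERNET_BPS = 10_000_000   # link speed quoted in the talk
AUDIO_BPS = 64_000          # sound file data rate quoted in the talk
FILE_BYTES = 30_000_000     # "30 megabyte file", assuming 1 megabyte = 10**6 bytes


def link_share(stream_bps: int, link_bps: int) -> float:
    """Fraction of the link one stream occupies, ignoring overhead."""
    return stream_bps / link_bps


def playing_time_seconds(file_bytes: int, stream_bps: int) -> float:
    """How long the file lasts when played at the stream rate."""
    return file_bytes * 8 / stream_bps


def ideal_transfer_seconds(file_bytes: int, link_bps: int) -> float:
    """Best-case time to copy the file across an otherwise idle link."""
    return file_bytes * 8 / link_bps


share = link_share(AUDIO_BPS, ETHERNET_BPS)
minutes = playing_time_seconds(FILE_BYTES, AUDIO_BPS) / 60
copy_seconds = ideal_transfer_seconds(FILE_BYTES, ETHERNET_BPS)
print(f"{share:.2%} of the link, {minutes} minutes of audio, {copy_seconds} s to copy")
```

On these assumptions one stream takes well under one percent of the Ethernet, and a 30 megabyte file holds roughly an hour of audio yet could in the best case be copied in under half a minute.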

We  do  a  variety  of  special  events  when  we  have  the  budget,  time,  and  inclination. 
What  we  are  trying  to  do  with  these  special  events  is  demonstrate  what  the  Internet  is 
and  how  it  can  be  used.  We're  trying  to  do  that  for  new  audiences.  We're  trying  to  talk 
to  congressional  officials,  or  national  public  radio  stations,  or  to  children,  and  tell  them 
about  this  new  tool,  the  Internet. 

We  also  try  to  push  the  technology  a  bit  on  some  of  these  special  events.  The 
Internet  is  fairly  robust  but  when  it  comes  to  moving  audio  or  video  real  time  over  the 
network,  it  doesn't  quite  work  yet.  Some  think  if  it  doesn't  work  yet,  we  shouldn't  do  it. 
But  if  you  look  at  the  first  radio  network  which  was  done  by  the  National  Broadcasting 
Corporation,  there  were  400  engineers  from  AT&T  and  NBC  standing  by  to  make  that 
network  happen.  So  the  way  to  learn  to  make  this  technology  go  real  time  on  a  frequent 
basis  is  to  bring  in  the  engineers  and  have  them  hack  it  together  the  first  time.  So  a  lot 
of  our  special  events  try  to  provide  a  stage  for  people  from  XEROX  PARC  and  other 
similar  research  institutes  to  come  in  and  do  a  little  playing. 

A  couple  of  examples:  A  couple  of  months  ago  we  did  the  Global  Schoolhouse 
Project  on  behalf  of  the  National  Science  Foundation.  Over  thirty  organizations  got 
together.  The  idea  was  very  simple:  there  were  kids  sitting  there  with  their  MACs; 
Apple  was  very  generous  and  donated  machines  to  four  schools:  one  in  California,  one  in 
Tennessee,  one  in  Virginia,  one  in  England.  The  children  learned  about  the  Internet, 
sent  mail  to  each  other,  did  research  on  the  environment,  read  Al  Gore's  book,  did 
original  research  in  their  communities  on  what  could  be  done  about  the  environment. 
Then  they  did  a  video  conference  over  the  network.  They  were  able  to  brief  White 
House  and  NASA  officials  on  what  they  thought  the  government  ought  to  be  doing.  We 
view  that  as  a  prototype  of  what  an  Internet  Town  Hall  might  be:  a  group  of  citizens,  a 
group  of  leaders,  an  issue,  people  go  off  and  do  their  research,  and  they  have  a  dialogue. 
This  is  not  a  one-shot  deal,  where  someone  asks  the  President  the  first  question  that 
comes  to  mind  on  first  meeting  him.  Rather  the  citizens  use  the  network  to  span  time 
and  do  some  in-depth  research,  the  Town  Hall  spans  time,  the  citizens  have  time  to  brief 
the  leaders,  and  the  leaders  to  brief  the  citizens,  and  a  real  dialogue  can  take  place. 




A  fun  example  of  this  recently  which  involved  Brewster  Kahle  was  National  Public 
Radio  meets  the  Internet.  We  linked  the  Internet  to  National  Public  Radio's  "Talk  of 
the  Nation  Science  Friday"  for  an  hour  of  live  radio.  Ira  Flatow  was  the  host  and  he  sat  in 
New  York  and  took  questions  from  the  Internet.  He  took  questions  two  ways.  One  is  I 
took  a  Radio  Mail  terminal  with  me  and  we  took  email  from  the  world  and 
demonstrated  that  with  one  thin  Une,  a  9.6  modem,  we  could  bring  in  300  questions  from 
the  outside  world  in  a  short  period  of  time.  When  you  do  interactive  events,  the  Umiting 
factor  is  how  people  can  participate.  What  we  were  trying  to  show  is  that  in  one  hour 
on  a  single  line,  300  people  got  their  comments  in.  So  rather  than  favoring  the  person 
who  has  the  speed  dial  button,  you  can  have  equal  access  as  this  is  a  much  fairer  way  of 
people  getting  their  questions  submitted.  We  also  demonstrated  audio  coming  in  from 
the  Internet.  Ira  was  the  perfect  host.  He  would  say,  "Let  me  get  this  straight.  Do  you 
have  a  telephone  there?"  "No,  I  don't  have  a  telephone."  "Well,  how  are  you  talking  to 
me?"  And  the  person  he  was  talking  to  would  respond,  "Well,  I'm  in  front  of  my 
computer;  I'm  talking  to  it." 

This  is  a  technology  that  we'd  like  to  see  the  radio  and  television  communities 
begin  to  adopt.  The  reason  is  that  they  are  the  professionals  at  producing  information. 
These  are  the  people  that  produce  information  for  a  living  and  I  think  that  it's  vital  that 
we  get  these  folks  on  the  Internet  because  a  lot  of  us  are  amateurs  at  producing 
information  or  professionals  at  producing  catalogs  of  information.  But  when  it  comes  to 
the  highly  produced  sources,  these  are  the  folks  that  we  need  to  get  on  the  network,  and 
not  just  for  email  but  for  programming. 

Brewster  asked  me  to  do  a  "blue  sky"  or  "what's  going  to  happen  tomorrow,"  but  I 
have  no  idea  what's  going  to  happen  so  let's  look  at  the  past  instead. 

WAIS is one view of the world and it has some nice attributes. It's decentralized:
no one database is in control of the other databases. It's also distributed, in that there are
many databases. The two are not the same thing; you could have centralized control over
a distributed environment. WAIS wisely is a bunch of independent databases tied together
in a transparent fashion.

In  my  view,  it  does  three  things.  It's  a  transport  mechanism,  Z39.50,  a  way  of 
allowing  a  client  to  talk  to  the  server.  It  is  a  referral  service  so  that  it  tells  you  that 
there's  another  server  on  the  other  side  of  the  world  which  has  something  you  might  be 
interested  in,  or,  in  WAIS  terminology,  that  there's  another  database,  another  ".src"  file. 
It's  also  a  language  for  talking  to  those  servers.  So  it  does  three  things  and  it  does  them 
very  simply.  You  don't  need  to  do  a  lot  of  user  training. 

It's a view of the world but it's one of several views of the world. And, as we
know, it's very important not to ignore those other views. When we look at World-Wide
Web, Gopher, and things like NetFind, whois, and finger (all directory services for
finding people), we see that the ones that work all have the same attribute. They look
easy to the user. Mosaic is a very easy interface to the Web. The reason Gopher works
is because it's a menu system. You pick any item and you get another menu. I can train
even senior managers on Gopher in five minutes.

So the ones that work have small, easy-to-establish nodes. You don't have to
solve the entire problem in order to get up and running. That's crucial. A lot of you are
professionals working to put large databases together. It's very tempting to say, "I'm going
to put my organization online," to make the entire organization go online at the same
time. To me, that's a recipe for disaster. It's much easier to find one small database and
put that one online. Maybe later we'll have problems with compatibility, but at least
we'll have something done.

WAIS  has  a  simple,  transparent  interface.  X.400  has  been  a  dismal  failure 
because  the  address  is  too  big  to  fit  on  a  business  card.  You  contact  an  X.400  user  by 
picking  up  the  telephone.  That's  because  the  X.400  address  is  too  long. 

Just  as  important  is  that  WAIS  is  built  over  today's  transport.  One  of  the 
geniuses  of  the  original  WAIS  team  is  that  they  looked  at  the  Z39.50  work  and  they  said 
that  this  is  very  important  but  it's  not  quite  ready  yet  and  we're  not  going  to  wait.  So 
they ripped out parts of Z39.50 and they invented the WAIS version of Z39.50 and used
that. That's causing them some problems now. Brewster is spending a lot of time going
through and adding in total Z39.50 compatibility. But it's up and running and that's why
you are all here today. It's tempting to say that this won't work until a standard is ready.
We  can  say  we've  got  to  have  X.500  because  without  a  global  directory  what  can  we  do. 
You  can  do  a  lot  of  things  that  might  not  scale  but  at  least  you  got  started. 

So  I'm  going  to  compare  this  to  a  totally  new  kind  of  service.  Those  of  you  who 
read  the  "New  York  Times"  a  couple  of  days  ago  or  the  "Washington  Post"  on  Monday 
might have seen some trouble that a colleague, Dr. Marshall Rose, and I are
causing. We've invented a new domain. My domain is radio.com. We have a new
domain called tpc.int. You may never have heard of the international domain; there
actually is another organization in .int and that is NATO. And so NATO and our team
are the two and we are both fairly dangerous, and in some ways I'm more dangerous
than NATO these days. What we're doing is an experiment in remote printing built on
top of the electronic mail infrastructure, and there are a bunch of printer gateways. You
will say, why would I want remote printing? We can send three different kinds of
documents: PostScript, ASCII, and TIFF. For purposes of our printing experiment, a
printer  is  any  G3  facsimile  device  anywhere  in  the  world.  What  this  lets  you  do  is  send 
electronic  mail  and  reach  any  fax  machine.  We  don't  have  the  whole  world  yet  but  we 
do  have  Japan,  Australia,  the  Netherlands,  Ireland,  Sweden,  and  a  lot  of  the  U.S. 
Organizations  like  Sun  Microsystems  now  are  serving  their  own  organizations.  What 
they are saying at Sun is that I'm not going to spend $0.06 placing a local call to two
people I don't know, some sender on email and some recipient on a fax machine, but I
am willing to do it within my own PBX. So the user sends mail to phone-number.tpc.int.
They  don't  know  about  remote  printers;  all  they  know  about  is  the  target  they  are  trying 
to  reach.  You  might  call  this  global  bypass.  We  prefer  not  to  use  that  term. 
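
The addressing idea can be sketched in a few lines of Python. The talk only says that mail goes to phone-number.tpc.int; the sketch below assumes the reversed-digit domain form described for the experiment in RFC 1486, and the function name, the underscore rule, and the sample number are illustrative rather than part of the service. Check the FAQ address given below before relying on any of it.

```python
def remote_printer_address(phone_number: str, recipient: str = "") -> str:
    """Build a tpc.int-style mail address for a fax number (illustrative sketch)."""
    digits = [ch for ch in phone_number if ch.isdigit()]
    if not digits:
        raise ValueError("phone number contains no digits")
    # Assumption: the domain lists the digits of the full international
    # number in reverse order, one digit per label, under tpc.int.
    domain = ".".join(reversed(digits)) + ".tpc.int"
    local = "remote-printer"
    if recipient:
        # Assumption: spaces in the cover-sheet name become underscores.
        local += "." + recipient.replace(" ", "_")
    return f"{local}@{domain}"


# Sample number chosen for illustration only.
print(remote_printer_address("+1 415 968 2510", "Arlington Hewes"))
```

The sender never names a gateway; under this assumption the reversed digits let ordinary mail routing pick whichever gateway has registered for that part of the telephone numbering plan.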

So how are WAIS and tpc.int similar? They are both built bottom-up; they are
both  decentralized.  We're  not  telling  people  how  to  operate  their  gateways.  We're  just 
saying  that  if  people  have  a  message  and  they  can  image  it  on  their  fax  machine,  do  it. 
There  are  many  different  models  of  remote  printer  operation.  Some  like  Sun  will  do  it 
within  their  organization.  They  have  a  salesman  who  won't  use  email.  Our  service  lets 
customers  on  the  Internet  send  him  a  fax.  NASA  Ames  Research  Center  is  using  it  as  a 
way  of  reaching  some  of  their  people  who  are  not  on  email  yet.  Others  are  running 
gateways  as  a  public  citizen  type  of  enterprise,  as  a  way  of  reaching  out  to  a 
neighborhood,  so  they  are  running  neighborhood  gateways.  Other  folks  are  looking  at 
this  and  saying  that  they  can  make  money  with  it.  We  are  allowing  gateways  as  part  of 
our  specification  to  use  one-third  of  the  cover  sheet  and  lease  it  as  advertising.  Junk 
fax?  No,  by  letting  a  third  party  pay  for  that  service  it  makes  it  feasible. 

So,  what  are  the  lessons?  Grass  roots  is  good.  If  you  are  trying  to  get  your 
organization  up,  it  is  tempting  to  look  at  an  organization  which  has  compatibility 
problems.  I've  found  that  it's  a  lot  easier  to  clean  up  after  the  fact  than  to  plan 
something beforehand. So, my message is to avoid central policy. You want standards,
but  avoid  the  temptation  to  do  just  that.  Show  by  example  and  not  by  memo. 

If  you  want  to  learn  more  about  the  remote  printing  experiment,  send  an  email  to 
tpc-faq@town.hall.org. 

Other  Internet  Multicasting  Service  Resources 

MBONE: isi.edu:/mbone/faq.txt

Audio:  ftp.cwi.nl:/pub/audio/ 

CU-SeeMe:  gated.cornell.edu:/pub/video/ 

PARC:  parcftp.xerox.com:/pub/net-research/ 

LBL: ftp.ee.lbl.gov:/

ITR  Sites:  sites@radio.com 

ITR  Info:  info@radio.com 

EISENHOWER NATIONAL CLEARING HOUSE
FOR  MATHEMATICS  AND  SCIENCE  EDUCATION 

Len  Simutis,  Director 

Nancy  O'Hanlon,  Associate  Director  for  Library  and  Information  Systems 


LEN  SIMUTIS: 

I  am  pleased  to  have  the  chance  to  speak  with  you  today  about  the  Eisenhower 
National  Clearing  House  for  Mathematics  and  Science  Education.  We  were  established 
at  the  Ohio  State  University  last  October  with  funding  from  the  Department  of 
Education's  Office  of  Educational  Research  and  Improvement.  Our  mission  is  to  get 
useful  and  effective  K-12  mathematics  and  science  curriculum  materials,  which  are  tied 
to  systemic  reform  efforts  nationally,  into  the  hands  or  at  the  fingertips  of  teachers  and 
students. 

The  federal  government  supports  an  incredible  amount  of  development  of 
curriculum  materials  and  programs,  and,  for  whatever  reasons,  that  material  is  not 
getting  into  the  hands  of  teachers  and  students.  So  our  role  is  to  identify,  gather, 
catalog,  and  distribute  curriculum  materials  and  programs,  the  materials  themselves.  We 
are  doing  that  by  creating  a  physical  repository  of  the  materials  in  Columbus,  Ohio,  and 
an  electronic  repository  as  well  to  distribute  those  materials  in  both  traditional  print 
formats  as  well  as  in  multimedia  formats,  on  demand  in  print  format,  on  demand  in  fax 
format,  in  CD-ROM  format  beginning  in  1995,  and  via  the  Internet. 

We are funded through the U.S. Department of Education and we are expected to,
and will, work very closely with the department as it puts together its nationwide
networks of Inet and Smartline. We are working as well with other federal agencies. While
we  are  funded  by  the  Department  of  Education,  we  are  expected  to  work  with  all  federal 
agencies  which  are  involved  either  directly  or  indirectly  in  the  development  of  materials 
which  could  be  useful  in  K-12  math  and  science  education.  We  are  working  most  closely 
with  a  group  called  the  Federal  Coordinating  Council  for  Science,  Engineering,  and 
Technology  with  the  wonderful  acronym  FCCSET.  This  is  an  inter-agency  effort  to 
identify  and  distribute  programs  which  support  mathematics,  science,  and  engineering 
education.  We  are  working  as  well  with  the  regional  education  laboratories  established 
some  fifteen  to  twenty  years  ago  across  the  country,  with  the  new  group  of  Eisenhower 
Regional  Consortia  which  are  ten  organizations  distributed  regionally  working  with  us  to 
support  mathematics  and  science  education,  with  other  database  providers,  and  with 
commercial publishers.

We believe that we have to build this system around the key elements of
interoperability and standards, that we will be a key source of information about K-12
math and science, but that we will not be the only source. There will be information
available from other database providers and from commercial publishers. Scholastic has
announced a network in the last couple of weeks for teachers. The National Education
Association  has  announced  that  it  will  be  developing  a  network  via  America  Online  to 
support  teachers.  A  whole  host  of  people  are  and  should  be  supporting  math  and 
science  reform.  We  need  to  develop  a  framework  which  will  reach  all  those  people. 

So one of the design elements of this project is to begin with the notion of
interoperability to create what we are calling a federated database. We have offered and
we are seeking additional support to allow us to bring together participants or groups
which are representing math and science education providers to develop a framework
within which we can develop a way of exchanging information via protocols and
standards.  We  are  beginning  with  a  database  architecture  built  on  the  Z39.50-199x 
standard.  Our  RFP  said  Z39.50-1992  but  we  plan  to  stay  as  current  as  possible  in  the 
implementation  of  that  standard.  We  need  to  work  toward  the  adoption  and  use  of 
common  thesauri  and  descriptors.  Nancy  O'Hanlon  will  address  this  in  more  detail.  We 
hope  that  will  be  helpful  to  you  as  you  see  how  we  are  trying  to  approach  these 
problems.  We  also  need  to  look  closely  at  document  and  media  interchange  formats 
which  have  been  referred  to  frequently  today. 

We  will  be  creating  a  catalogue  of  curriculum  materials  and  programs.  We  will 
be  appending  to  those  catalogue  entries  evaluations  of  the  materials,  both  systematically 
collected evaluations and anecdotal evaluations by teachers and others. We will be
creating text files: to the degree we are able to obtain rights to redistribute
materials developed under federal support or otherwise, we want to be able to make those
files available. We will be cataloging and distributing, again with the appropriate rights,
computer  programs,  image  files,  and  eventually  videos.  By  the  third  year  of  our  work, 
early  in  1995,  a  subset  of  the  materials  we  are  collecting  will  be  distributed  in  CD-ROM 
nationally.  We  are  creating  as  well,  with  the  help  of  Aspen  Systems  of  Rockville, 
Maryland,  which  is  taking  the  lead  in  collecting  the  information,  a  directory  of  federal 
agency  programs  which  support  math  and  science.  We  think  this  will  be  a  very  helpful 
publication  which  will  be  distributed  regionally;  it  will  also  be  a  database  which  will  be 
accessible  nationally. 

One  of  the  benefits  in  starting  an  organization  like  ours  in  the  1990s  is  that  we 
can  start  fresh  without  any  legacies,  but  one  of  the  difficulties  then  is  identifying  the 
starting  point.  The  transitions  of  technology  are  often  difficult  for  people.  I  attended  a 
retirement  dinner  for  a  library  director  recently.  At  that  dinner  they  gathered  all  the 
technologies  that  that  director  had  seen  over  the  last  thirty  years  starting  with  punched 
cards  and  ending  with  the  advanced  workstations  of  today.  When  I  got  to  the  reception 
line,  I  said  "George,  you  must  have  seen  a  lot  of  changes  over  the  years.  Look  at  all  that 
has  happened  here."  And  he  said,  "Len,  I  sure  have  seen  a  lot  of  changes  and  I've  been 
opposed  to  every  damn  one  of  them."  We  don't  have  to  make  a  transition  from  one 
system  to  another  but  we  know  that  we  are  starting  something  that  has  to  be  in  place  for 
a  long  time.  So  we  have  to  be  very  careful  about  what  we  put  in  place,  and  we  want  to 
do so in a way that keeps interoperability standards and protocols paramount in what we
are  doing. 

I  am  going  to  put  up  a  couple  of  noisy  diagrams,  copies  of  which  are  in  the 
handouts  available  to  you,  not  so  we  have  to  look  at  all  the  details  but  to  show  you  our 
design.  I  attended  the  first  SIGWAIS  which  was  held  at  USGS  and  came  back  with  a 
number  of  handouts  which  I  gave  to  people  I  work  with  who  said  "Gee,  that  looks  just 
like  our  diagram."  And  that  was  very  reassuring  that  we  were  not  coming  up  with  a 
design  which  was  inconsistent  with  what  other  people  were  proposing.  Basically,  at  the 
top is a set of clients, a Mac client, a Windows client, a terminal emulation client which
will  attach  somehow  to  the  Internet  that  will  then  connect  to  clearinghouse  systems 
which  are  built  around  Z39.50  servers.  The  servers  will  provide  access  to  a  variety  of 
databases,  some  of  which  may  be  text,  image,  etc. 

On  the  side  of  the  diagram  are  related  databases  from  other  database  providers. 
The acronym we are using is CAMSED, Coalition of Automated Mathematics and
Science  Education  Databases.  The  vision  we  have  for  the  teacher  is  one-stop-shopping. 
You  submit  a  search  against  a  database  and  that  gets  translated  in  the  WAIS  metaphor 
across  multiple  databases.  The  ideal  situation  is  our  database  or  other  databases  know 
about  databases  that  the  teacher  or  student  doesn't  know  about,  and  you  get  results  back 
from  places  you  don't  know  about.  In  other  words,  you  find  things  you  weren't  looking 
for.  That  is  the  real  power  and  utility  of  having  interoperable  systems  and  of  having 
compatible  formats. 
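
A minimal Python sketch of the one-stop-shopping idea follows. It assumes nothing about the real CAMSED or Z39.50 software: each database is modeled as an in-memory object that can answer a query and name other databases it knows about. All class, function, and database names here are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical stand-in for one searchable server."""
    name: str
    records: list[str]
    referrals: list[Database] = field(default_factory=list)

    def search(self, term: str) -> list[str]:
        """Return records containing the term, ignoring case."""
        term = term.lower()
        return [r for r in self.records if term in r.lower()]


def federated_search(start: Database, term: str) -> dict[str, list[str]]:
    """Query the starting database and every database reachable by referral."""
    results: dict[str, list[str]] = {}
    seen: set[str] = set()
    queue = [start]
    while queue:
        db = queue.pop(0)
        if db.name in seen:
            continue  # avoid loops when databases refer to each other
        seen.add(db.name)
        hits = db.search(term)
        if hits:
            results[db.name] = hits
        queue.extend(db.referrals)
    return results


# The teacher only knows about the catalog; the other two are found by referral.
museum = Database("museum-kits", ["Fractions manipulatives kit", "Volcano model"])
agency = Database("agency-programs", ["Summer fractions workshop"], [museum])
catalog = Database("clearinghouse-catalog", ["Fractions for grade 5"], [agency])

print(federated_search(catalog, "fractions"))
```

The user submits one query to one place, and results come back from databases the user never named, which is the behavior described above.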

If  I  were  going  to  highlight  one  word  on  this  diagram  it  would  be  API  which 
stands for Application Programming Interface. Those are the hooks by which you get into
the  proprietary  side  of  the  search  engines  and  the  database  architectures.  Absent  that 
API  you  move  into  a  vendor  supplied  and  controlled  situation.  As  we  look  for  ways  to 
migrate  to  Z39.50,  we  are  finding  that  having  vendors  provide  that  API  is  extremely 
important  because  that  is  how  we  interface  with  the  search  or  database  engines  that  are 
developed  with  considerable  expense  and  time  by  the  vendors.  We  are  optimistic  that 
those  APIs  are  going  to  be  available. 

I  think  that  the  important  thing  for  me  to  emphasize  before  Nancy  speaks  about 
some  of  the  issues  that  we  are  starting  to  face  in  implementing  a  Z39.50  environment  is 
that we see ourselves as the reverse kind of clearing house. The old kind of clearing
house is a hierarchy where everything is stored and to which you must physically transport
yourself to access and use the information. Many times when people start these kinds of
activities,  they  think  of  it  very  hierarchically.  Connect  to  us  and  we'll  get  you  to  the  rest 
of  the  world.  But  that  all  falls  apart  when  you  try  to  start  building.  We  really  see  this  as 
a  distributed  network  environment  with  a  variety  of  information  providers,  a  variety  of 
services  we  can  connect  to,  some  of  which  will  be  commercial,  others  which  will  be 
generated  by  federal  agencies,  and  still  others  by  voluntary  support.  We  see  this  as  a 
way  of  strengthening  and  enriching  the  materials  which  are  available  for  K-12  teachers 
and students, and as a genuine experiment in delivering needed information services
across  national  networks. 

Nancy  is  going  to  speak  now  about  some  of  the  issues  we  are  dealing  with  right 
now  in  moving  toward  Z39.50. 

NANCY  O'HANLON: 

As  Len  said  I  am  going  to  talk  more  specifically  about  the  database  we  are 
building  at  the  Eisenhower  Clearinghouse.  The  primary  database  we  are  building  is  a 
bibliographic database or a catalog. In a way it's a mutant bibliographic database. First
of  all,  we  are  going  to  be  including  all  sorts  of  media  in  our  catalog  so  that  all  kinds  of 
things  besides  printed  materials  will  be  described.  We  hope  to  do  that  in  a  multimedia 
format,  to  present  more  than  just  print  information  back  to  the  user  of  the  catalog.  One 
of  the  things  that  we  hope  to  do  is  to  provide,  at  least  for  those  who  have  the  right  kinds 
of  client  software  and  the  right  kinds  of  workstations,  linked  images.  Some  of  the  kinds 
of  images  that  might  be  useful  in  terms  of  evaluating  material  and  deciding  whether  to 
go  to  the  next  step  of  acquiring  are  tables  of  contents,  perhaps  samples  of  the  chapters. 
That gets back to the issue of dealing with the producers of the material and acquiring
permission to present those images. But we do expect that there will be a lot of interest
and that we'll also be dealing with some public domain information.

Some  of  the  images  would  be  of  printed  objects  or  pages,  some  of  the  images 
might  be  photographs  because  many  of  the  materials  in  our  collection  will  be  objects 
themselves  or  manipulative  materials  that  are  useful  for  teaching  math  or  models, 
different  kinds  of  equipment  and  objects.  While  a  printed  description  is  helpful  and  it's 
needed  to  be  able  to  search  and  retrieve  that  record,  actually  seeing  a  photographic 
image  of  the  materials  is  much  more  useful  in  making  a  decision  about  it.  So  one  of  the 
things  that  we'll  be  looking  at  doing  is  actually  providing  digitized  photos  that  would  be 
linked  to  some  of  the  catalog  records  wherever  it  seems  to  be  most  useful.  Another  type 
of  image  is  video  clips  taken  from  those  videos  that  seem  to  lend  themselves  to  it  or 
seem  to  need  a  little  more  introduction  than  just  the  printed  record.  And  down  the  road 
we'll  offer  some  sound  files.  That  sort  of  enhancement  we  hope  to  be  presenting  to  a 
large  number  of  users. 

The  other  kind  of  enhancement  relates  to  the  value  of  information.  One  of  the 
things  that  we've  been  mandated  to  do  as  part  of  our  contract  for  this  project,  and  one  of 
the  things  that  makes  the  project  most  interesting,  is  to  incorporate  not  only  information 
that  is  already  out  there,  what  people  have  said  in  reviews  and  the  studies  educators 
have  done  to  try  to  determine  the  effectiveness  of  the  materials,  but  also  what  teachers 
are saying about the materials as they try to use them in their classrooms. So, one of the
ways  this  database  may  be  mutant  is  that  it  will  be  interactively  built  by  the  users  along 
with  the  staff  of  the  clearinghouse.  So  we  hope  to  be  collecting  evaluative  information 
from  users,  translating  that  into  a  usable  format,  and  linking  or  attaching  that 
information  to  the  records. 

I also want to talk about the issues related to WAIS because one of the servers
that we'll be making available is the WAIS server. We've experimented with the free
WAIS server and are planning to bring up the WAIS, Inc., server but we haven't gotten
that far in our development yet. So some of the things I'll be discussing may actually be
statements or questions that we have about what the capabilities of that particular
product are. It may be that when we bring it up we'll discover that we're happy with the
results, and if anybody out there has had experience that could help us, we'd be glad to
hear about it. The things that we're interested in providing fall under content and quality:
content relates to the kind of information that we'd like to provide, and quality relates to the
kinds of searches that we allow users to do and also to the kind of results that they get
back.

For the bibliographic part of the database, we think that the Boolean and literal
phrase  search  features  and  the  ability  to  do  truncation  within  the  WAIS  environment  are 
very  helpful.  The  hyper-search  kind  of  capability  which  allows  you  to  find  an  item  and 
run  that  against  the  database  and  find  other  items  like  it  is  certainly  a  feature  that  we 
like  a  lot. 

Other things that we are looking for, and are not sure yet how they are going to
play out in our own development, include fielded search. The records are structured,
and it may not be useful for a teacher who is looking for materials for fifth
graders to run a search for the word "fifth" and get back records with the addresses of
publishers on Fifth Avenue. We've all had that kind of experience in our own
searching. So we think that the ability to run a search against specific fields is
an important one and that it helps to provide a better quality product. The other thing
that  we're  really  interested  in  doing  is  being  able  to  display  entries  from  our  thesaurus, 
our  controlled  vocabulary,  in  order  to  enable  the  teacher  or  user  to  do  a  better  search, 
to  not  have  to  guess  about  the  terminology  that  is  in  the  database  but  to  be  able  to  see 
representations  of  what  the  terms  are.  Again,  it's  not  clear  at  this  point  what  the 
capabilities  of  WAIS  will  be  in  this  regard,  but  this  is  certainly  a  feature  that  we'll  be 
looking  for. 
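
The Fifth Avenue problem is easy to reproduce. The Python sketch below uses made-up records and field names (not the clearinghouse's actual schema, and not any WAIS feature) to contrast a free-text search with a search restricted to one field.

```python
# Hypothetical catalog records; the field names are illustrative only.
RECORDS = [
    {"title": "Measuring Matter", "grade": "fifth", "publisher_address": "12 Main Street"},
    {"title": "Pond Life", "grade": "third", "publisher_address": "500 Fifth Avenue"},
]


def free_text_search(records, word):
    """Match the word anywhere in any field of a record."""
    word = word.lower()
    return [r for r in records if any(word in str(v).lower() for v in r.values())]


def fielded_search(records, field_name, word):
    """Match the word only in the named field."""
    word = word.lower()
    return [r for r in records if word in str(r.get(field_name, "")).lower()]


# Free text returns both records, including the Fifth Avenue publisher.
print([r["title"] for r in free_text_search(RECORDS, "fifth")])
# Restricting the search to the grade field returns only the fifth-grade item.
print([r["title"] for r in fielded_search(RECORDS, "grade", "fifth")])
```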

My  last  transparency  has  to  do  with  the  quality  of  searches  and  navigation  issues. 
Because  we  will  be  providing  access  to  other  databases  besides  our  catalog  through  the 
client  or  the  user  interfaces  that  we  provide,  we  have  to  exercise  some  control  at  that 
level  even  though  it  is  an  open  system,  in  terms  of  collection  development  and  what 
databases we make available and how relevant they are to our user community. Beyond
that though there will still be problems related to consistency that I think we are just
starting to be aware of both in terms of the kind of data that are contained in the
database and certainly the vocabulary and the indexing structures. In some ways WAIS
provides a nice leveling facility in that you are not coping with lots of different kinds of
database search engines so there's a predictability in terms of the response and I think
that's  one  of  the  attractive  features  related  to  trying  to  bring  up  a  group  of  databases 
together. 

We are looking to Z39.50 to help us in terms of attributes and identifying fields in
a  fielded  search  kind  of  environment  that  may  have  disparate  labels.  The  bottom  line  is 
that  the  producers  of  the  database  may  have  employed  different  terminology  despite  our 
best  efforts  to  try  to  work  together  to  use  a  common  vocabulary.  In  order  to  provide  a 
quality  search,  we  need  to  find  a  way  to  overcome  those  disparities.  One  of  the  issues 
we're  looking  at  is  the  issue  of  a  meta-thesaurus  since  we  would  have  presumably  a  little 
more  control  and  knowledge  about  the  databases  that  we  would  provide  access  to 
through  the  clearinghouse.  At  some  level  we  think  that  we  may  be  able  to  do  some 
tinkering  in  order  to  improve  the  consistency  of  search  results  that  come  back  to  users  by 
doing  mapping  of  vocabulary.  Again,  we'll  be  looking  to  see  what  kind  of  support  there 
might  be  within  the  WAIS  environment  for  us  to  do  that. 
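
One way to picture the vocabulary mapping is a small lookup table that folds each provider's local term into a shared term. The Python sketch below is only an illustration under that assumption; the terms and function names are invented, and it says nothing about what WAIS or Z39.50 actually supports.

```python
# Hypothetical meta-thesaurus: each provider's local term maps to one shared term.
META_THESAURUS = {
    "maths": "mathematics",
    "arithmetic skills": "mathematics",
    "earth sci": "earth science",
    "geology": "earth science",
}


def normalize(term):
    """Return the shared term for a local term, or the cleaned term if unmapped."""
    key = term.strip().lower()
    return META_THESAURUS.get(key, key)


def expand_query(term):
    """Return the shared term plus every local term that maps to it."""
    shared = normalize(term)
    variants = {local for local, common in META_THESAURUS.items() if common == shared}
    return variants | {shared}


# A search for "Maths" is widened to the terms other providers may have used.
print(sorted(expand_query("Maths")))
```

Expanding the query on the clearinghouse side leaves each provider's database untouched, which fits the situation described above where producers keep their own terminology.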

I  think  I  may  have  posed  more  questions  than  given  answers  at  this  point.  I  hope 
that  the  next  time  that  we  come  back  to  talk  to  you  we'll  actually  have  something  to 
show  you  and  maybe  some  answers. 

NETSURFING - SHOOTING THE CURL


Wayne  Allen 
EINET 


I  work  for  EINET  in  Austin,  Texas,  which  is  part  of  MCC.  First,  let  me  tell  you  a 
little  bit  about  what  EINET  is  and  what  we  do.  EINET  stands  for  Enterprise  Integration 
Network. We provide secure client server applications, secure communications,
commercial transaction support, virtual private network support, and, more relevant to
this  gathering,  network  information  navigation. 

We've  been  looking  at  the  navigation  problem  very  seriously  because  we  feel  that 
is  one  of  our  most  important  value-added  elements  to  our  network  services.  Our 
customers  tell  us,  and  we  concur,  that  navigation  of  information  on  the  network, 
particularly  the  Internet  information  that's  freely  available,  has  to  really  be  easy.  So 
what  we're  doing  is  trying  to  adopt  a  role-based  navigation  in  which  users  play  roles,  very 
much like when they assume their work role of doctor or lawyer, or a role assumed just
for fun. But these roles are flexible. A user has to be able to change roles, as a
manager  at  one  point  and  some  other  kind  of  role  at  another.  You  need  to  be  able  to 
customize  roles,  and  the  data  that  the  user  sees  should  change  depending  on  what  role 
he  adopts.  The  reason  that  we  do  this  is  that  our  main  target  is  the  commercial  world, 
and  most  of  the  commercial  world  is  still  standing  on  the  beach.  You  don't  have  to 
teach  them  how  to  be  expert  surfers  or  how  to  shoot  the  curl,  you  just  have  to  get  them 
into  the  water.  Easy  navigation  is  the  way  to  do  that.  We  think  that  the  surfer  down 
inside  them  will  take  over  after  -  that  is  our  theory. 

So  the  kind  of  information  we  need  to  provide  for  them,  and  what  we  are  looking 
at initially, is the information available over the Internet. It comes in many different
forms. There's raw data available through traditional, older means such as FTP and
Gopher.  Then  there  is  more  structured  data  which  actually  provides  some  mechanism 
for  searching  such  as  Veronica  which  is  a  way  of  searching  Gopher  space,  World-Wide 
Web,  X.500,  WAIS,  Archie,  whois,  and  a  number  of  others. 

Then there's information about information. Some good examples that you can
find in the Web, if you want to see some interesting databases, are the UUNNA
Meta-Library at MIT and the World-Wide Virtual Library at CERN. O'Reilly and Associates
has something called "The Whole Internet Resource Catalog" which is very interesting,
and the Library of Congress, although I'm not sure you can classify this as
meta-information, has just built a Gopher server called LC MARVEL which you should
look  at  as  it  really  is  quite  a  marvel. 

So in order to provide role-based navigation, which we are really just starting to
look at, we have to identify both raw and meta-information on the network, and we have to
create role-based search spaces so that the navigation can proceed without the confusion
of semantic heterogeneity among the lexicon of the database. In other words, if you
have a database that's all about baseball, then when you ask about Babe Ruth, you're
not going to get confused by the answers that you get. But to just index and search
through raw data from the Internet is practically worthless.

So  this  is  what  I  mean  about  searching  across  protocols.  Protocols  are  typically 
boundaries  of  search  spaces  right  now.  If  I  want  to  find  something  about  forestry,  I  go  to 
the  Gopher  search  space  and  I  look  in  the  Gopher  world.  Then  I  go  to  WAIS  and  look 
in  the  WAIS  world.  Then  I  go  elsewhere  and  search  that  world.  We  don't  want  to  be 
that  difficult  to  deal  with.  We  want  to  just  look  for  forestry,  and  find  information  about 
it  regardless  of  which  access  protocol  space  it's  in. 

Lastly,  as  a  network  provider,  we  don't  want  to  bring  all  this  information  into  our 
site  to  give  to  people  when  they  find  it.  Therefore,  we  have  to  deal  mainly  with 
references  to  information. 

I don't know how to describe this slide because we're not very certain of this
ourselves. This gives a general picture of how we're going to proceed in our experiment.
I've built an experimental WAIS server and indexer that understands uniform resource
locators (URLs), which, as was explained earlier this morning, are the means to span
protocols through the address of a piece of information or a service or a document.
Basically we're going to build raw WAIS databases from the various protocol access
spaces such as the Gopher databases, World-Wide Web databases, and so forth. And
then we are going to extract these URLs from these raw databases using some standard
taxonomy or categorization mechanism, identifying the documents which
belong in each topic or category, regardless of where they live and through which
protocol they are accessed. Then we are going to integrate more meta-information
sources in with the raw data and create topical role-based databases which are
WAIS searchable and will return references to remote information.
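
The plan described above, sorting URLs into topic databases regardless of access protocol, might be sketched roughly as follows. This is only an illustration under stated assumptions: the taxonomy, the host names, the titles, and the function names are all hypothetical, and matching on title words stands in for whatever categorization mechanism is actually used.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical taxonomy: topic -> keywords that place a document in that topic.
TAXONOMY = {
    "forestry": {"forest", "forestry", "timber"},
    "baseball": {"baseball", "pitcher"},
}

# Hypothetical raw index entries: (URL, title) pairs gathered from each protocol space.
RAW_ENTRIES = [
    ("gopher://gopher.example.org/00/usda/forest-report", "National forest report"),
    ("wais://wais.example.org/timber-db", "Timber harvest statistics"),
    ("http://www.example.org/ruth.html", "Babe Ruth baseball records"),
]

def build_topic_databases(entries, taxonomy):
    """Group URLs by topic, ignoring which protocol serves them."""
    topics = defaultdict(list)
    for url, title in entries:
        words = set(title.lower().split())
        for topic, keywords in taxonomy.items():
            if words & keywords:
                topics[topic].append(url)
    return dict(topics)

def search(topic_dbs, topic):
    """Return (scheme, url) references; the documents themselves stay remote."""
    return [(urlsplit(url).scheme, url) for url in topic_dbs.get(topic, [])]

topic_dbs = build_topic_databases(RAW_ENTRIES, TAXONOMY)
print(search(topic_dbs, "forestry"))
```

A search for "forestry" here returns one Gopher reference and one WAIS reference from a single topic database, which is the point of spanning protocols with URLs.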

We now know a little more about how we are going to deliver this information
because people are going to be looking for this information through a variety of means.
Obviously, we are going to provide WAIS clients. We are making WAIS clients for the
Mac and PC available toward the end of this month as shareware. When you access
information with a WAIS client, typically a user will get the document that he asked for
because the WAIS server will gateway: get that document through whatever protocol it is
accessed by, and return it via Z39.50. That makes most of the world of the other
protocols available to simple WAIS client users. For example, if what is referenced in
the database is a Gopher menu, what the server will do is turn that into a WAIS catalog
which WAIS clients know to look at and let you select from. Then there are cases where
what is found is actually a service, a telnet interactive service, for example. In this case,
the server will return a typed document containing the URL so the client can launch the
appropriate application to use the service.

For Gopher clients, the user will basically see what any Gopher client would see,
a Gopher item list. The same is true for World-Wide Web clients.
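
The three cases just described for WAIS clients (an ordinary document, a Gopher menu, an interactive service) can be pictured as a small dispatch on the URL scheme. The sketch below is an assumption-laden illustration, not the EINET server's code: the function name, the `is_menu` flag, and the result labels are invented, and the actual fetching over each protocol is left out.

```python
from urllib.parse import urlsplit

def gateway_response(url, is_menu=False):
    """Decide what a gatewaying server hands back for a referenced URL.

    Returns a (kind, payload) pair; fetching is left out of this sketch.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "telnet":
        # An interactive service cannot be fetched as a document, so return
        # the URL itself and let the client launch a suitable application.
        return ("url-reference", url)
    if scheme == "gopher" and is_menu:
        # A Gopher menu becomes a catalog the WAIS client can select from.
        return ("wais-catalog", url)
    # Anything else is fetched over its own protocol and returned as a document.
    return ("document", url)

print(gateway_response("telnet://library.example.org"))
print(gateway_response("gopher://gopher.example.org/11/menu", is_menu=True))
```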

The  peculiarity  of  this  database  is  that  it  will  have  nothing  to  do  with  the  clients 
per  se.  The  database  is  going  to  be  smart  in  how  it  finds  and  returns  data  to  users. 

This  is  a  very  difficult  undertaking  for  a  number  of  reasons.  One  is  the  immense 
amount  of  raw  data  out  there.  There  are  over  1,400,000  entries  in  Gopher  space,  only 
some  small  portion  of  which  are  Gopher  menus.  Most  of  those  entries  are  references  to 
real  data.  To  even  index  the  titles  of  the  menus  is  an  immense  chore,  which  is  what  a 
Veronica  index  is.  The  amount  of  space  in  X.500  is  immense.  We  have  an  X.500  walker 
which  goes  out  and  finds  things  in  X.500.  We  WAIS  index  it  and  throw  the  data  away 
because  we  can't  afford  to  keep  it  all.  The  problem  is  that  we  need  to  organize  as  a 
community  to  find  ways  of  making  meta-information,  information  about  the  raw  data, 
available  so  that  navigation  services  can  be  provided  to  find  the  data.  There  is  no  way 
the  current  protocols  can  support  that.  The  new  WAIS  protocol,  for  example,  has  a 
feature  that  allows  you  to  inquire  about  the  nature  of  the  database  itself.  That  is  needed 
in  all  of  the  different  protocols. 

If you are interested in talking to me further about this or in helping with it, send
mail to me. For more general information about EINET, send a message to
EINET-INFO@EINET.NET. For information about EINET's WAIS clients, send a
message to WAIS-TALK@EINET.NET.

AN INTELLIGENT WAIS INTERFACE
Lucian  Russell 

Director,  Advanced  Computer  Applications  Center 
Argonne  National  Laboratory 

SLIDE  1:  Advanced  Applications  for  Saving  Time  and  Money 

*  Argonne  National  Laboratory  believes  that  new  technology  (e.g.  Cooperative 
Answering)  can  provide  this  enhancement  to  WAIS. 

*  New  public  domain  technology  can  be  used  to  incorporate  Thesauri  into  search 
engines  without  spending  $50,000/copy  for  software. 

*  Argonne  believes  that  by  carefully  exploiting  public  domain  technology  from 
Universities,  the  benefits  of  advanced  Information  Retrieval  can  be  provided  to 
government  agencies  TODAY. 

SLIDE  2:  An  Intelligent  WAIS  Interface  Project:  Overview 

* Extend the WAIS Client that assists the user in retrieving documents from the
server (add new terms).

*  Define  for  the  WAIS  Server  how  to  identify  relevant  documents  (increase 
precision). 

*  Help  minimize  UNNECESSARY  TRANSFER  of  irrelevant  data  across  the 
Internet. 

SLIDE 3: Future Headlines?

* "Danger: WAIS falls victim to its own success!"

* "The Information Highway a Victim of Government WAISte!"

* "User WAISting away, waiting for data"

* "The Future of WAIS: Waiting for KNOWBOT"

SLIDE 4: The WAIS Architecture: Public Domain Version Beta Release

SLIDE 5: WAIS Directory: Locates WAIS Servers via Internet

*  Locates  WAIS  servers  that  claim  to  have  relevant  files/information. 

*  Relevance  Feedback  a  function  of  the  Server's  "Intelligence". 

* Current "Public Domain" server ranks "relevance" by a count of keywords.

*  Nobody  judges  adequacy  of  servers,  either  content  or  relevance  ranking. 
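
Ranking "relevance" by a count of keywords, as the slide describes, can be illustrated in a few lines. This is a minimal sketch of the general idea, not the public domain server's actual code; the function names and sample documents are made up.

```python
def keyword_count_score(query, document):
    """Score a document by how many times the query words occur in it."""
    doc_words = document.lower().split()
    return sum(doc_words.count(term) for term in set(query.lower().split()))

def rank(query, documents):
    """Return documents sorted from highest to lowest keyword count."""
    return sorted(documents, key=lambda d: keyword_count_score(query, d), reverse=True)

docs = [
    "forest fire report",
    "fire fire fire sale on furniture",
    "annual budget summary",
]
ranked = rank("forest fire", docs)
print(ranked)
```

For the query "forest fire" the furniture advertisement ranks first simply because it repeats one keyword three times, which is the kind of weakness that counting alone invites.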

SLIDE 6: WAIS can be used for:

*  Finding  repositories  of  files  containing  relevant  information. 

* Filtering documents (files) based upon a user profile.

* Transmitting relevant documents to the local database (Client) for later perusal
and use.

* Avoiding excess work and cost!

SLIDE 7: The WAIS Dilemma: Precision vs. Recall

* Favoring Precision: a customized preprocessor to the Client can potentially
improve precision of retrieved documents.

* Benefit: the Client can sort and sift and analyze at the user's leisure.

* Drawbacks:

-- excessive traffic due to unneeded files being shipped around the world.
-- extra cost from servers with access charges

* We would be better off with greater precision at the WAIS server.

SLIDE 8: The WAIS dilemma: Client or Server based "Intelligence"

* Clients may retrieve documents using one or more search methods:

-- Boolean

-- statistical relevance

-- thesaurus-based

* Benefits: full capability of commercial products is available.

* Drawback: high cost (e.g. $50,000) of client software discourages use.

SLIDE 9: Project Goals for the Intelligent WAIS (IWAIS) Interface

* Investigate the effectiveness of merging intelligent database technology with
other areas of information technology to better control the precision of access to WAIS
documents.

* Determine what types of intelligent processing can be performed on the client.

* Determine what types of intelligent processing can be performed on the server.

SLIDE 10: The WAIS Architecture: IWAIS Client

SLIDE 11: The WAIS Architecture: IWAIS Server

SLIDE 12: The IWAIS Interface: Project Environment

*  The  CARMIN  system:  a  logic  program  implementing  cooperative  answering 
techniques  for  enhancing  query  processing  in  deductive  databases. 

*  An  initial  semantic  net:  The  Environment,  Safety  and  Health  (ES&H) 
Thesaurus  developed  by  the  DOE  Office  of  Scientific  and  Technical  Information  (OSTI) 
becomes  part  of  a  deductive  database. 

*  ES&H  documents  from  the  DOE  developed  for  the  Facility  Profile  Information 
Management  System  (FPIMS). 

SLIDE  13:  Project  Plan 

*  Build  a  semantic  net  from  the  OSTI  Thesaurus  (broader  terms,  narrower  terms, 
related  terms). 

*  AUGMENT  THE  SEMANTIC  NET  WITH  ADDITIONAL  PREDICATES 
AND  FACTS. 

*  Generate  a  logic  model  from  the  semantic  net. 

* Perform a precision/recall comparison test with and without the cooperative
answering preprocessing of queries.

SLIDE 14: Program Auspices: Division of Educational Programs (DEP)

*  The  Department  of  Energy  has  special  programs  for  college  students  to  obtain 
research  experience. 

*  The  Advanced  Computer  Applications  Center  (ACAC)  selected  two  students  to 
participate  in  the  IWAIS  evaluation  this  summer. 

*  DEP  pays  for  the  students'  participation. 

* Guidance is given on an as-needed basis by specialists in logic programming.

SLIDE 15: Hypothesis under Test

*  Using  Cooperative  Answering  will  improve  precision/recall:  more  appropriate 
documents  will  be  retrieved. 

*  Conventional  methods  fail  to  find  all  relevant  documents,  so  additional 
methods  are  needed  to  give  more  meaning  to  searches. 

* Adding knowledge about the real world in the form of additional predicates
(assertions about facts and relationships) augments concept-based searching and
retrieving.

SLIDE  16:  Thesaurus  methods  provide  only  a  start! 

*  Knowing  that  the  "Department  of  Energy's  Secretary"  is  a  "Cabinet  Officer" 
doesn't  help  you  find  that  the  Secretary's  name  is  "Hazel  O'Leary." 

* Knowing that Argonne National Laboratory is geographically "nearer" to Oak
Ridge National Laboratory than Brookhaven National Laboratory (USGS WAIS Server
information) will not tell you that Argonne and Brookhaven "report to" the DOE field
office in Chicago whereas Oak Ridge does not.

SLIDE  17:  Generating  Cooperative  Answers  to  Queries 

*  Improve  human-computer  interaction  through  collaboration 

*  Interpret  queries  based  upon  user  expectations,  desires,  and  interests 

*  Detect  and  correct  misconceptions  in  user  queries 

* Present answers to queries such that they are understandable by the user

SLIDE 18: Relaxation Technique

*  Experimenting  with  relaxation  technique  used  in  the  CARMIN  system 
(prototype  cooperative  answering  interface) 

*  Work  interactively  with  the  user  to  find  alternative  answers  related  to  the 
answers  of  the  original  query 

*  Use  taxonomy  clauses  to  relax  predicates,  constants,  and  variable  dependencies 
in  the  query  to  expand  the  query's  search  space 

SLIDE  19:  Example 

* User: Which DOE sites have material science departments?

* Relaxation steps:

--(1) split query into key terms (e.g. DOE site, material, science,
department)

--(2) ask user if he/she wants to relax any of the terms, for example,
"DOE site"

SLIDE 20: Example (continued)

--(3) generate query template for each term in (2) from taxonomy (e.g. a
taxonomy  based  upon  domain-specific  thesaurus) 

#  specialization:  "national  laboratory" 

#  generalization:  "site" 

#  synonym:  "location" 

--(4)  pass  query  template  (i.e.,  modified  query)  to  retrieval  or  database 
system  (e.g.,  WAIS) 
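
Relaxation steps (3) and (4) above can be sketched as a simple query expansion. The code below is a hedged illustration and not the CARMIN system: the taxonomy dictionary, the function names, and the Boolean AND/OR output syntax are all assumptions made for the example, and steps (1) and (2) are taken as already done.

```python
# Hypothetical taxonomy fragment; a real one would be derived from a
# domain-specific thesaurus (broader, narrower, and related terms).
TAXONOMY = {
    "DOE site": {
        "specialization": ["national laboratory"],
        "generalization": ["site"],
        "synonym": ["location"],
    },
}

def relax(term, taxonomy):
    """Step (3): list the term followed by its taxonomy alternatives."""
    alternatives = [term]
    for related in taxonomy.get(term, {}).values():
        alternatives.extend(related)
    return alternatives

def build_query(key_terms, terms_to_relax, taxonomy):
    """Steps (3)-(4): OR together alternatives of relaxed terms, AND the rest."""
    clauses = []
    for term in key_terms:
        if term in terms_to_relax:
            options = relax(term, taxonomy)
        else:
            options = [term]
        clauses.append("(" + " OR ".join(f'"{t}"' for t in options) + ")")
    return " AND ".join(clauses)

# Steps (1) and (2) are assumed to have produced these inputs.
key_terms = ["DOE site", "material", "science", "department"]
query = build_query(key_terms, {"DOE site"}, TAXONOMY)
print(query)
```

Only the term the user agreed to relax ("DOE site") is widened; the other key terms pass through unchanged, so the search space grows in a controlled direction.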

SLIDE  21:  Original  Experiment  Evaluation 

* Compare documents retrieved by WAIS server and their relevance ranking: Boolean
searches before and after additional terms are added.

* Problem: public domain SUN server WAIS does not support Boolean search.

* Problem: public domain SUN server WAIS only uses single keyword count as
a "relevance" ranking criterion.

SLIDE 22: Current Evaluation Plan

*  Obtain  additional  terms  from  IWAIS  using  the  Cooperative  Answering 
Technique. 

*  Construct  Boolean  queries  using  new  terms 

* Perform search in the Facilities Profile Information Management System using
the "Personal Librarian" Information Retrieval tool.

* Evaluate precision/recall differences with two methods, Boolean search and
Conceptual search (statistical methods).
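
The precision/recall comparison in this plan rests on two standard ratios: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. A minimal sketch, with made-up document IDs standing in for real search results:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Made-up document IDs for illustration only.
relevant = {"d1", "d2", "d3", "d4"}
baseline = ["d1", "d7", "d8", "d9"]   # results for the original query
expanded = ["d1", "d2", "d3", "d7"]   # results after adding new terms

print(precision_recall(baseline, relevant))
print(precision_recall(expanded, relevant))
```

Running the same calculation on results obtained with and without the added terms gives the before/after comparison the slide calls for.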

SLIDE  23:  WAIS  Server  Conjecture 

*  Separate  databases  of  Thesauri  terms  AND  databases  of  facts  and  knowledge 
(predicates)  could  be  stored  on  a  per-user  basis. 

*  Alternative  relevance  ranking  criteria  could  be  considered  as  independent 
"dimensions"  for  ranking: 


--  Statistical  measure 

-- Thesauri-enhanced measures

--  Fact/predicate  database  measures 
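
One way to picture independent ranking "dimensions" is a weighted sum in which each user chooses how much each measure counts. This is purely an illustrative assumption; the slide does not say how the dimensions would be combined, and the scores, weights, and names below are invented.

```python
def combined_rank(scores, weights):
    """Order document IDs by a weighted sum of independent relevance scores.

    scores: {doc_id: {"statistical": x, "thesaurus": y, "facts": z}}
    weights: per-dimension weights, e.g. chosen by each user.
    """
    def total(doc_id):
        return sum(weights.get(dim, 0.0) * value
                   for dim, value in scores[doc_id].items())
    return sorted(scores, key=total, reverse=True)

# Invented scores for two documents along the three dimensions.
scores = {
    "a": {"statistical": 0.9, "thesaurus": 0.1, "facts": 0.0},
    "b": {"statistical": 0.4, "thesaurus": 0.8, "facts": 0.5},
}

print(combined_rank(scores, {"statistical": 1.0}))
print(combined_rank(scores, {"statistical": 1.0, "thesaurus": 1.0, "facts": 1.0}))
```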


SLIDE 24: Conclusion

* Haste in implementing servers will make for a WAIS-full use of network
resources.

* Maximizing relevance criteria options for servers is a necessity.

* Providing an intelligent means to merge data from subject-matter databases
thesauri will increase search and retrieval capabilities beyond those provided by Ian
based tools.

USE OF WAIS FOR DATABASE ANALYSIS


David  Icove 

Federal  Bureau  of  Investigation 


My name is David Icove. I'm program manager at the FBI's National Center for
the Analysis of Violent Crime. That's located in Quantico, Virginia, at our FBI
training academy facility. If you saw the movie "The Silence of the Lambs," my hallway
was where it was filmed. Some of the computer equipment that they featured in the
film I was in the process of surplusing; had I known, I would have given it to
them. But I want to thank the Library of Congress as well as the other WAIS
participants, as well as my friends at the U.S. Geological Survey who have helped us
along just recently in the new endeavor we call "Project Matchup." USGS provided us
their version, which has given us some untold results, and this is the first forum of
its kind in the public sector in which we are going to be able to brag about it.

My objective today is to give a very short but complete talk. Basically I
want to tell you about two or three things. The first is the existence of the
National Center for the Analysis of Violent Crime (we call it the "National Center") as a
member of the WAIS community. Don't look on the Internet for "FBI.gov"; you are not
going to find us. But we are there, in a private network.

We are also using the WAIS technology not only to search for information but
also to match together what we consider to be very sparse databases. And the types of
crimes that we look at are unsolved violent crimes that occur throughout the United
States.

The third thing is that I wanted to at least address a few items with you as far as
our research and development efforts that we perceive to be important in the future.

To give you a little background, the National Center is basically a fusion center.
It's an all-source collection center for all information about violent crimes that occur
within the United States. And the types of violent crimes we look at are those done by
serial offenders, that is, offenders that use jurisdictional boundaries to their advantage to
evade detection by law enforcement. We also provide advanced training to Federal,
State, local, and foreign law enforcement officers in the area of using advanced
technology for solving crimes internationally. And we also do our own research and
development. If you sort of take a look at it, the National Center basically is like a
mini-laboratory facility. We're presented with real-life problems; we address them, we
take a look at them, and in some cases have to go out to find the technology to solve very
realistic crime problems, very accurately portrayed. And like I said before, the USGS
provided us at least with the "intro" and their experience in helping us with this problem.

We also have found in recent audits of our own information that the existing
crime analysis technologies that we had were inferior and that they did not provide the
results that we wanted. They failed basically to extract a lot of the information that we
had about our offenders and needed to support our investigations. The bottom line,
at least based on my evaluation and that of contractors who have taken a look at what we are
doing, is that the Z39.50-WAIS standard appears to be the way of the future for crime
solving.

Like I mentioned before, our basic problem is in looking at serial offenders that
use jurisdictional boundaries. We have problems, and if you see some of the popular
literature and read some of the newspapers on a daily basis, you are going to see that
serial violent offenders are both something of the past and will be a part of our future. They
use jurisdictional boundaries; they use the fact that they can evade detection by law
enforcement by crossing boundaries and going from one jurisdiction to another, and what
they do is prevent law enforcement [agencies] from talking to one another as well as
sharing these facts and cross-referencing the information. And what we
used to call the modus operandi, the MO of the offense, we look at as "crime
signatures." The basis for this is that we needed good techniques to extract not only the
method of operation but also these signatures from the databases, and the databases that
we had are definitely very sparse.

On the tests that we conducted, these were the data sources that we looked at. I
felt like a cook: I threw a little Associated Press wire service "traffic" in there. We have a
foreign broadcast information service which tells us what's going on worldwide. VICAP
is our unsolved homicide database. Those are unsolved homicides that come in to us
throughout the country from law enforcement agencies that said, "Hey, we can't solve this
homicide or this group of homicides. It must have been done by an offender who has left
and fled our jurisdiction and moved to another one, so what can you do to try to match
that case with all the other cases you (the FBI) may have had reported to you?" We also
have the National Firearms Reporting System. The program that I manage is the Arson
and Bombing Investigative Services. We look at all arson and bombing cases throughout
the United States. We get 100,000 arsons and bombings a year within the U.S. It's just
an amazing total. 85% of those cases are not solved. So we have about 85,000; of those,
about 25,000 are motor vehicle fires. So we have a sparse database there, but we
have a lot to do, and anytime you see a good major serial arson or bombing case, we've
got it. We are always constantly massaging these databases to try to determine whether this
event has happened in the past. We also have bomb incident reports that come from our
bomb data center, and we have fire incident reporting, and we have National Incident
Base reporting, and then what we have called NCIC "bolof" messages, "Be On the
Lookout For" messages. These seem to be the sparsest of all the databases we have,
and are what I call, basically, "messages in the bottle."

I used to work for a police department, and what happened was, when you couldn't
solve a case, you went down to the NCIC Center that we had and you
wrote a little message, and it said: "Some fellow was in our jurisdiction and he has done
this, this, and this, and he is no longer here, and if anybody has any crimes similar to it,
please let us know." And we would send it out, hoping that somebody else would read it.
And I would go down every day and read the messages that came out from other
agencies and then say, "Well, I may have it." One day I picked up a message, and we had just
had a case like this, and I went back down to take a look at the clipboard and the
message was gone. And the NCIC clerk said, "Hey, if you didn't keep a copy of it, it's
gone." So what we have been doing is archiving these messages.

So based on that, we went to the next step. We went to our users (there
are three programs that we have down there: arson and bombing, unsolved homicides,
and people who do basic criminal investigative analysis on general crimes) and I
said to them, "What do you want?" (And these people are not computer literate.) So they
said: "We don't want any slick and shallow software." They were intuitive, weren't they?
They had limited computer skills; they wanted to be able to access multiple databases at
a single stroke. They said: "We don't want to have to go in and in and hit and hit
different databases." (Which is what was going on.) (And you can see the picture getting
closer and closer to the WAIS solution here.) They wanted intuitive user interfaces, and
they also wanted to search and match similar cases.

In the FBI we have a great deal of time spent indexing information. We can look
up anything. If we found a stolen typewriter: "Yup, we can find it. Yup, this is where it
was stolen from," and NCIC had it or we had it in one of our files. But if you went back
and said, "All right, we have had a series of bombings over the last 10 years, and this has
been the MO, this is the target, this is the 'victimology,' this is the damage that was done,
and this is the area of the country that we want to concentrate in," we had no way of
going back to these databases and saying: "Given this constraint, given this relevance
feedback, tell us similar cases just like this." So this is the impetus of why we started
"Project Match-up".

We wanted national standards. I went to our procurement people and especially
our computer people, our IRM folks, and they said, "Hey, we don't want you doing
anything unless it has the word 'GOSIP' next to it," all the open standards and
everything else. And I said, "OK, you're the boss," and so I went back and said, "I think
that these might adhere," and we wanted to have a Client-Server relationship.

So here are some results. Right off the bat, we went out and got from USGS a
couple of versions of WAIS, and finally we settled on theirs because we liked the clean
interface as well as the Boolean logic. We put some simple queries in there and got some
simple answers. We put cases in there that we knew matched and we got back matches,
but what really bothered me was that we were getting back matches on other cases that we were
unaware of. The known cases that were matching matched very closely; new information
was received on active cases; the relevance feedback scores were very high in a lot of the
cases, and we were able to see, once they fell off, exactly where the cut points were.
We also structured our databases; basically we "tokenized" a lot of the information that
we put in and we were able to fool the system into doing much better. So we learned a lot,
and we have some other text retrieval projects at the National Center that we oversee to
look at "Concept-Based Reasoning." We would look at legal reasoning; we look at
concepts that would understand who the President of the United States might be. And
we applied that logic, appended those concepts to the end of some of the messages that
we had, and threw that back through WAIS. And it raised its usability that much
higher. So a large fraction of the text we saw was really going unseen.

To give you some statistics on what was going on: we took our unsolved homicide
database, in which we had over 200 linkages, 200 sets of known serial cases, and on
the average there were 4 homicides for each of the 200-some sets, ran it
through, and we looked and we said, "All right, let's try it out." So we started, and the analyst
said, "Oh no, you're cheating because the same police officer filled out some of the same
reports," so we said, "Fine, those are intra-state reports; we'll do the inter-state," which
basically represented two or more states in the different reports. So unless a police officer
quit one department and went to go work for another one and filled out the same report,
the inter-rater reliability issue was pretty well washed away. From those, 77% of the
cases we were able to match back to their original pool of known cases. The only reason
that the 77% wasn't higher was that there were new cases coming in that they were unaware
of altogether, and that was pushing the known matches down into lower relevance
scores. If we added the intra-state cases, we would push it past 90%. So these are
definitely interesting aspects that we have seen.
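
A match-back figure of this kind could be computed along the following lines. This sketch is an assumption about the shape of such a test, not the National Center's actual procedure: the case IDs, the cutoff `top_k`, the choice of the first case in each set as the query, and the stand-in `retrieve` function are all hypothetical.

```python
def match_rate(linked_sets, retrieve, top_k=10):
    """Fraction of known linked cases recovered when querying by a sibling case.

    linked_sets: list of lists of case IDs known to belong together.
    retrieve: function(case_id) -> ranked list of other case IDs (assumed).
    """
    found = total = 0
    for cases in linked_sets:
        query, expected = cases[0], set(cases[1:])
        top = set(retrieve(query)[:top_k])
        found += len(expected & top)
        total += len(expected)
    return found / total if total else 0.0

# Toy stand-in for a ranked search; real rankings would come from the server.
FAKE_RESULTS = {"h1": ["h2", "x9", "h3"], "h5": ["x1", "x2", "h6"]}
linked = [["h1", "h2", "h3", "h4"], ["h5", "h6"]]
rate = match_rate(linked, lambda c: FAKE_RESULTS.get(c, []), top_k=2)
print(rate)
```

In the toy data, unrelated cases (the "x" IDs) sit above some known matches, so a tight cutoff lowers the rate; this mirrors the remark that new cases were pushing known matches down into lower relevance scores.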

As far as the future aspects at the National Center, we definitely see that there is
going to be more effective use of computer technology, information technology, in some
of the areas. We already have departments coming to us saying, "Hey, we have these
large sources of textual information in our databases as well as geographically based
information. What's the next step? What should we do?" And they are asking the same
questions we are: we don't care, we don't want to look up similar cases; we want to
match similar cases based on MO. So the other thing WAIS fits the bill on is the fact
that we can pass that technology on to another agency and we are not barred by the
normal issues of keeping it within the Federal government; we can extend out to state, local,
as well as foreign law enforcement agencies. And that's our true mission at the center,
basically: sharing and retrieving technology as well as investigative skills, to help us out.
So if it's an efficiency-in-government issue, if it's TQM, we're not quite sure what it is, but
basically we meet the guidelines as far as enforcement response and cooperation. We
definitely intend to do further cooperative research with our friends at USGS as well as
DOE and some of the other Federal and local agencies in striving to enhance the
concept of what's being applied. And I know that a lot of people were surprised when I first
let them know that we were actually applying this technology to solving crimes, even
internally within the FBI, and they said, "What! We've never even heard of this. And
what are you doing?" So I said, "Well, it's only two years old so we wouldn't expect you
to know too much."

We  are  also  looking  for  implementations  of  images  and  multi-media.  A  lot  of  the 
information  we  have  especially  with  the  retention  plan  of  the  National  Center,  bemg  a 
fusion  center,  we  are  like  a  Library  of  Congress,  but  for  violent  crime.  And,  our 
retention  period  is  50  years  or  the  life  of  the  program.  So  our  records-management 
people  asked  me  at  one  time,  "Do  you  think  you  will  need  more  than  50  years  retention
period?"  and  I  said,  "Well,  call  me  back  then  and  I  will  let  you  know."

But  images  are  a  great  deal  of  the  information  we  have,  because  we  can  tell  a  lot
from  a  crime  scene  and  it's  very  laborious  to  go  back,  and  say,  "You  know,  I  got  a  case
just  like  that  in  the  past",  and  go  try  to  find  the  aging  photographs,  the  yellowing
documents  that  we  have  so,  as  far  as  wide  area  information  servers,  we  would  almost
want  to  consider  wide  area  information  sharing  because  of  the  ability  for  us  to  quickly 
retrieve  and  access  the  information  that  we  have.  And  basically,  also,  the  handling  as  far 
as  databases  which  we  have  to  realize,  as  far  as  our  mission  is  concerned  basically  the 
"buck  stops  here".  The  National  Center  was  established  basically  to  handle  the  cases 
that  were  cast  off  by  law  enforcement.  The  cases  that  can't  be  solved  and  basically  what 
we  are  showing  is  that  technology  like  WAIS,  when  it's  applied  correctly  can  take  low 
solvability  information,  sparse  databases  and  actually  solve  cases  using  the  technology.  So 
in  conclusion,  I  thought  that  this  quote  would  fit  well  that,  "We  are  drowning  in
information,  but  we  are  starved  for  knowledge,"  from  my  favorite  book. 

Any  questions? 

QUESTION  1:  A  question  on  recidivism,  you  may  have  cases  which  are  closed
on  persons  caught  and  apprehended  -  who  go  to  jail  for  some  time  but  then  come  out 
and  resume  -  so  you  have  a  case  there  where  the  case  is  closed,  might  be  archived,  but 
of  use,  and  ought  to  be  tied  in  when  certain  offenders  are  released. 

ANSWER  1:  That's  a  good  question  regarding  recidivism.  We  do  a  lot  of  things
about  recidivism,  one  is  that,  we  go  out  and  interview  the  experts,  who  are  not  the
investigators  but  the  actual  criminals  themselves,  so  we  go  to  the  prison  systems  and  seek
out  the  people  most  likely  to  exercise  recidivism  and  actually  do  full  interviews  and  we
ask  them  questions  like  nobody  ever  asked  before,  like,  "How  did  you  get  caught?",  "Why
did  you  get  caught?"  and  "Would  you  ever  do  this  again?"  as  well  as  a  lot  of  research
protocol  issues.  And  when  I  said  we  are  basically  like  an  R&D  corporation,  that's  one
step  ahead.  So  for  example,  from  a  person  who  we  interview  or  a  case  we  worked
historically  in  the  past  or  cases  that  are  submitted  by  multiple  agencies  that  don't  realize
that  we  are  looking  at  the  same  offender.  We  have  seen  those  types  of  instances  in  the
past  so  the  issue  of  recidivism  is  very  near  and  dear  to  our  hearts.  I  haven't  proved  yet
whether  or  not  there  is  a  good  predictor  for  repeat  human  behavior  and  in  the  future  -
once  an  offender  gets  out  of  jail  or  one  of  their  cases  is  resolved.  That's  a  good
question.  Thank  you.

QUESTION  2:  [inaudible]

ANSWER  2:  We  are  just  taking  the  data  and  placing  it  in  flat  file  format,  directly 
in,  we've  exploited  every  technique  that  you  can  use  in  WAIS.  What  we  have  done  is
"pre-iconized"  the  information.  For  example,  a  location  is  also  blended  into  its  item
number,  location  and  name  itself  so  when  WAIS  indexes  it,  it  just  looks  like  a  token,  so
we  have  been  able  to  force  it  to  be  more  accurate  than  it  has  been  in  the  past.  We  just
place  it  in  a  flat  file  and  I  have  these  contests:  the  National  Center  is  in  a  basement.
(We  sometimes  call  it  the  "National  Cellar".)  We  are  trying  to  see  how  large  a  database
we  can  force  through  WAIS,  before  it  collapses.  And  at  the  last  database  that  I  picked
up,  I  thought  was  going  to  be  the  largest,  we  had  11,000  incidents  of  bombings  and  then
I  went  through  another  incident  database  of  interviews  and  we  indexed  about  150,000
interviews  in  the  database  and  it  responded  just  as  well  as  my  little  11,000  item  database
for  the  bombing  incidents  or  5000  for  something  else,  so  we  had  not  been  able  to
"collapse"  the  system.  We  end  up  with  some  very  large  indices  but  the  system  works  fine
and  we  are  very  happy.  As  I  said  we  are  very  happy  to  be  a  member  of  the  community,
too,  because  we  get  a  lot  of  the  information  just  from  some  of  these  forums. 
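
The  field-blending  idea  described  in  this  answer  can  be  sketched  as  follows.  This  is  only  an  illustration  under  stated  assumptions:  the  record  fields  (item,  location,  name,  text),  the  helper  names,  and  the  toy  word  index  are  hypothetical  and  are  not  the  National  Center's  actual  file  layout  or  the  WAIS  indexer  itself.  It  assumes  a  word-based  indexer  that  splits  on  anything  other  than  letters  and  digits.

```python
import re
from collections import defaultdict


def fuse(*parts):
    """Blend several field values into one alphanumeric token (hypothetical helper)."""
    # Strip everything a word-based indexer would split on, then concatenate.
    return "".join(re.sub(r"[^A-Za-z0-9]", "", str(p)).lower() for p in parts)


def to_flat_line(record):
    """Write one record as a flat-file line: fused token first, free text after."""
    token = fuse(record["item"], record["location"], record["name"])
    return f"{token} {record['text']}"


def build_index(lines):
    """Toy stand-in for a word indexer: map each word to the lines containing it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for word in re.findall(r"[a-z0-9]+", line.lower()):
            index[word].add(line_no)
    return index


# Illustrative records only; the values are made up.
records = [
    {"item": 17, "location": "Quantico, VA", "name": "Depot",
     "text": "device found near loading dock"},
    {"item": 18, "location": "Quantico, MD", "name": "Depot",
     "text": "device found in parking lot"},
]
lines = [to_flat_line(r) for r in records]
index = build_index(lines)
```

Because  the  item  number,  location,  and  name  are  fused  before  indexing,  a  search  on  the  fused  token  matches  exactly  one  record,  while  ordinary  words  in  the  free  text  still  match  every  record  that  contains  them.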

QUESTION  2:  [continued,  inaudible]

ANSWER  2:  [continued].  I  have  no  memory  size  problems;  we  are  using  an  old,
antiquated  VAX  and  it  seems  to  be  coming  along  just  fine.  We've  also  started  using  it  on
a  DEC  workstation  and  a  SUN  platform,  in  comparing  and  contrasting  the  speed,  but
time  is  relevant  when  you  work  in  a  basement  and  I  just  set  it  and  let  it  index  over  the
weekend  and  when  I  come  back,  on  Monday  morning,  it's  there.

QUESTION  3:  [inaudible]

ANSWER  3:  This  is  still  an  experiment.  What  we  have  done  is  to  show  that  we
can  match  our  existing  cases.  To  put  this  in  a  timeframe,  we're  talking  less  than  3  months
from  the  time  we  received  the  software  from  USGS  to  this  date  that  we  have  been  using
the  technology  and  like  I  said  we've  already  had  outside  agencies  come
in  to  take  a  look  at  it,  we've  had  contractors  come  in  and  look  at  what  we  are  doing,  and
saying  "We  don't  know  why  you  are  doing  so  well"  and  I  say  "Neither  do  I  but  I'm
keeping  my  fingers  crossed."  But,  I'm  sure  that,  I  saw  that  little  novel  newsletter  about
the  WAIS  clips  in  the  front,  there,  and  I'm  sure  that  I'll  be  more  than  glad  to  send  you
the  results  from  our  first  case  when  we  get  some  positive  feedback.

Other  than  that,  I  want  to  thank  you  very  much,  I'm  going  to  be  around  a  little 
later  on,  if  anybody  has  any  questions  that  they  want  to  ask  that's  fine.  Thank  you  very 
much.


DowVision


Greg  Gerdy 

Assistant  Director  of  DowVision  Services 
Dow  Jones  &  Co.,  Inc. 


Good  afternoon.  I  hope  everybody's  holding  up  OK? 

I  had  a  little  bit  of  apprehension  coming  in,  I  know  that  Dow  Jones  is  a  big  fan  of 
WAIS,  a  big  fan  of  the  Internet,  and  probably  a  year  ago,  they  may  not  have  asked
anybody  from  a  commercial  firm  like  Dow  Jones  to  speak  here.  But  in  fact  there  are 
some  historical  reasons  why  you  might  have,  anyway  at  that  time,  and  over  the  course  of 
the  past  year,  things  have  started  to  open  up  for  the  commercial  world. 

But  I  wanted  to  share  a  couple  of  stories  that  relate  to  Brewster  Kahle,  Dow  Jones, 
Apple  and  Peat  Marwick  and  Thinking  Machines,  because  I've  been  in  the  lucky  position
to  be  both  the  product  manager  of  ongoing  products  but  also  had  some  kind  of
responsibility  for  new  business  development,  or  new  product  development.  I've  been  able
to  see  WAIS  both  as  a  slice  in  time  of  what's  in  the  market,  and  also  what's  coming.

Let  me  give  you  an  example.  We  have  had  DowQuest  available  since  1988,  and
of  course,  I've  been  able  to  see  user  reactions  -  I'm  a  user  myself.  I'm  a  big  fan  of  it
for  half  of  my  searching,  the  other  half  I  use  our  full  text  database.  But  for  half  of  my
searches,  and  fuzzy  kind  of  searching,  it's  fantastic.  But  also  around  the  same  time  we
started  up  with  a  couple  of  different  flavors.  First  of  all  our  executive  VP  at  the  time,
Bill  Dunn,  had  set  up  a  little  skunk  works,  in  his  office,  using  WAIS  connected  back  to
this  static  database,  I  don't  think  it  was  DowQuest,  but  it  simulated  a  live  session  where
the  only  pointers  were  running  from  his  Mac  down  to  our  system  downstairs,  and  it  was  sort  of
saying  "this  is  really  going  somewhere."  And  he  would  bring  people  in  and  out  of  his
office  showing  people  where  this  was  going.  And  then  we  had  the  project  with  Apple,
Thinking  Machines,  and  KPMG  which,  although  it  didn't  totally  come  together  as  a
product,  all  the  separate  elements  went  their  own  way  and  are  becoming  things  we  are
[illegible]  on  our  own  on  DowVision,  which  I  will  talk  about  in  a  second,  KPMG  has  become  a
big  Mac  user  and  is  evaluating  DowVision,  Apple  went  on  and  did  some  [illegible]
reporter  software  and,  of  course,  Brewster  went  out  and  [illegible]
of  exciting,  it  didn't  come  together  as  you  might  have  envisioned  in  those  first  meetings.

Anyway,  I  was  asked  to  talk  about  DowVision,  today,  and  I  will  do  that,  but  I 
hope  at  a  gathering  in  the  future  I  have  a  chance  to  talk  a  little  bit  more  about  it  as
it  relates  to  the  new  WAIS.

Anyway,  DowVision  is  a  commercial  service  from  Dow  Jones  that's  been  available
for  a  couple  of  years  now.  It's  a  fully  integrated  information  service  that  combines 
components  of  broadcast  delivery  of  real-time  news  and  of  interactive  retrieval  back  to
our  host  services,  specifically  Dow  Jones  News  Retrieval.  It  runs  over  a  proprietary  X.25 
network  and  is  assembled  at  the  customer  site  by  Alliance  Developer  software  that 
provides  a  client-server  solution  for  the  customer. 

Taking  a  look  at  a  schematic,  the  most  exciting  thing  for  me,  being  part  of  this,  is 
that  you  start  with  the  world  wide  news  gathering  of  Dow  Jones,  where  reporters  are  out 
there  filing  stories,  and  literally,  electronically  it  moves  all  the  way  through  the  system
right  to  the  profile  on  your  desktop.  So  in  a  sense  you  have  900  Dow  Jones  reporters,  AP
reporters,  with  whom  we  have  a  joint  relationship  and  all  the  other  people  contributing
these  new  sources  getting  directly  to  your  desktop  essentially  in  real  time.  So  it's  an
exciting  concept  having  them  all  work  for  you. 

To  give  you  an  idea  of  the  sources  that  are  available,  it  starts  with  "The  Wall 
Street  Journal"  delivered  every  day  at  2am,  same  day  delivery,  and  since  we  allow  you  to 
store  it  for  up  to  six  months,  this  becomes  a  pretty  important  archive  for  business  people 
who  often  have  a  shorter  time  frame  as  far  as  information  requirements.  But  it  also 
contains  a  range  of  business  and  financial  news  not  only  from  Dow  Jones  but  third  party 
sources,  such  as  PR  News  Wire  and  Business  Wire. 

I  think  that  one  of  the  most  important  things  about  it  is  that  it  is  integrated  into 
the  way  that  you  work.  The  lesson  I  got  from  what  Bill  Dunn  had  demonstrated,  using 
WAIS  technology  a  number  of  years  ago,  was  that  the  system  worked  well  in  his  desktop
environment.  And,  in  fact,  we  work  with  third  party  developers  who  solve  problems  for  a
wide  range  of  platforms,  operating  systems,  hardware,  network  environments  -  it's  pretty
well  covered. 

For  the  commercial  user  we  broke  down  another  barrier,  in  that  the  information 
is  priced  on  a  flat-fee  basis  so,  now,  you  can  actually  budget  for  it.  Many  of  you  are  used 
to  using  Internet  and  WAIS  for  information  that  you  don't  necessarily  pay  for  -  although 
you  always  pay  for  it,  one  way  or  another  -  but  the  biggest  obstacle  in  the  commercial
market  has  been,  "I  can't  predict  how  much  this  is  going  to  cost  me,  get  me  a  flat-fee 
price",  so  we  have  done  that  and  it's  proved  to  be  very  popular. 

The  other  thing  we've  learned  is  nobody  in  the  commercial  market  is  sitting 
around  waiting  for  real-time  news  to  come  to  their  desk  like  in  the  securities  industry.
However,  we've  already  been  able  to  attract  the  wide  range  of  job  functions  that  you  see
listed  here  and,  on  a  slide  I  don't  have  here,  a  wide  range  of  industries,  everything  from
pharmaceuticals,  to  high  tech,  to  public  relations,  to  oil  and  gas  firms,  energy  firms.  We
are  already  seeing  the  interest  in  this  kind  of  delivery  system  mirror  the  kinds  of  use  that
you  would  see  in  a  Dow  Jones  News/Retrieval  or  some  of  the  other  services,  like
Dialog  and  NEXIS.

In  fact  I  would  just  like  to  share  a  few  war  stories  because  some  of  them  are  very
good,  there  are  others,  even  more,  that  I  am  not  going  to  mention.  And  we  feel  good 
that  we  have  proved  that  the  concept  works.  Much  like  WAIS  is  coming  along  and 
proving  that  information  retrieval  across  a  wide  spectrum  can  work,  if  the  software  is 
good,  in  much  the  same  way  we  feel  that  the  concept  of  delivering  into  the  corporate  market  is
starting  to  prove  itself  out.  For  example  there's  a  company,  in  New  Jersey,  a  public  utility
company,  where  they  had  people  literally  sitting  there,  flipping  through  papers,  clipping
articles,  putting  it  into  this  huge  stack,  and  circulating  it  around  the  company.  Well,
somebody  said  that,  "this  is  nuts,  why  are  we  doing  this?!?"  So  they  brought  the  system 
in  and  now  all  their  executives  are  receiving  DowVision  on  their  desktop.  It's  been  great 
for  us,  of  course,  penetrating  the  corporation  and  getting  our  news  on  people's  desks. 
The  problem  is  the  other  half  of  the  company  was  described  by  their  representatives  as 
being  "out  of  Jurassic  Park",  so  we're  never  going  to  reach  those  people,  because  they 
never  look  at  computers.  But  that's  OK. 

We  also  have  mixed  environments;  there's  a  very,  very  large  software  company 
where  the  executives  get  the  full  blown  system  and  the  other  people  in  the  organization
--  a  couple  thousand  of  them  --  get  E-mail  delivery  of  the  news.

We  also  have  -  this  is  my  favorite  story  because  I  heard  it  first  person  --  there's  a 
group  that  we're  working  with  that  has  DowVision  on  a  firm  wide  basis,  and  they  saw  a 
piece  of  news  come  in,  three  or  four  o'clock  on  a  Friday  afternoon,  that  one  of  their 
subsidiaries  on  the  West  coast  was  in  trouble.  They  put  together  a  "SWAT"  team  and
went  out  over  the  weekend  and  by  Monday  they  had  saved  the  business.  So  a  system  like 
this  is  going  to  pay  for  itself  with  one  incident  like  that.  But  what  we're  really  trying  to 
do  --  and  I  don't  have  to  tell  any  of  you  this  because  you  are  in  the  field,  but  if  you  can 
get  people  and  users  to  develop  a  regular  habit  of  keeping  up  with  their  news,  their 
business  news,  in  this  case,  they're  going  to  be  better  informed  and  better  able  to  deal 
with  the  competitive  environment. 

And  just  one  other  measure,  then,  of  how  we're  doing  for  those  of  you  who  might 
have  heard  about  what  we're  doing,  as  I  said,  we've  been  available  commercially,  two 
years  now,  we  have  over  100  customers,  our  user  count  is  getting  close  to  about  30,000, 
that's  registered  users,  and  it's  growing  at  a  nice  pace. 

Finally  I  think  --  I  don't  have  to  tell  any  of  you  this  --  but  it  is  something  that  we
are  trying  to  get  our  customers  to  understand:  that  to  really  operate  in  the  nineties,  the 
external  information  that  you  need  to  know  is  becoming  much  more  important  in 
running  your  business,  because  the  cycle  times  are  shorter  and  on  and  on  -  you  have 
heard  all  the  reasons  --  and  we  are  now  starting  to  build  up  some  good
evidence  that  in  the  corporate  market,  a  solution,  like  the  one  that  we  have,  is  very
critical  to  that.

Specifically,  what  comes  beyond  this,  our  interactive  connection  back  to  Dow
Jones  News/Retrieval  and  DowQuest,  is  nearly  completed;  our  development  is  done, 
some  of  our  Alliance  Developers,  people  like  Desktop  Data  and  Verity,  some  of  them,
will  be  coming  along  with  solutions  to  tie  back  into  our  host  service.  One  of  the  things
that  gets  raised  in  discussing  the  issue  of  WAIS  is,  well  "Where  will  Dow  Jones  go  with
this?"  Well,  number  one,  we  know  the  technology  very  well  and  we  are  very  impressed
with  it.  Number  two,  having  a  host  service  with  over  fourteen  hundred  text  publications,
we're  always  interested  in  what  the  latest  retrieval  technologies  might  be.  Third,  having
set  up  our  own  network  and  looking  at  other  ways  to  distribute  information,  the  Internet
itself  is  very  attractive.  So  with  that  in  mind,  you  can  imagine  -  although  I  can't  say
much  more  than  that  -  we  are  looking  at  this  very  closely  because  it's  a  very  important 
collection  of  technologies  that  we  would  like  to  see  come  together  and  help  our 
customer  base  as  well  as  attract  new  customers. 

Anyway,  thanks  for  the  opportunity  to  present  today  and  I  would  hope  again  some 
time  in  the  future,  we  might  have  even  more  to  say. 

Any  questions?

QUESTION  1:  [inaudible]

ANSWER  1:  Our  systems  development  group  is  looking  closely  at  Z39.50.  I 
don't  have  anything  specific  that  I  can  talk  about  today,  but  as  I  said,  in  the  context  of 
my  earlier  comments,  we're  watching  it  very  closely,  because  we  know  it's  important. 
Thank  you.


INTRODUCTION


Advanced  digital  network  development  is  a  multifaceted  enterprise  encompassing 
a  wide  range  of  activities  and  issues.  Expansion  and  modernization  of  the  nation's 
information  infrastructure  is  central  to  this  effort.  Equally  important,  however,  are 
strategies  to  ensure  that  this  infrastructure  supports  national  priorities  for  education 
and  research  and  development  initiatives. 

This  compilation  summarizes  selected  legislation  in  the  103rd  Congress  on 
electronic  information  delivery.  These  legislative  proposals  affect  numerous  policy 
issues,  including  fostering  infrastructure  development,  applying  the  fruits  of  that 
development  to  schools  and  libraries,  enhancing  public  access  to  government  electronic 
information,  and  protecting  intellectual  property  in  a  networked  environment. 

We  have  used  six  topic  headings:  Infrastructure  Development,  Government 
Information,  Educational  Applications,  Library  Applications,  Health  Services,  and 
Privacy  and  Intellectual  Property.  Although  some  bills  could  be  listed  under  several
headings,  we  have  tried  to  place  each  one  under  what  we  perceive  to  be  its  primary 
purpose.


H.R.  707  (Dingell)
Companion  Measure  S.  335

Emerging  Telecommunications  Technologies  Act  of  1993 
Amends  the  National  Telecommunications  and  Information  Administration 
Organization  Act  to  require  the  Assistant  Secretary  of  Commerce  for  Communications 
and  Information  and  the  Chairman  of  the  Federal  Communications  Commission  (FCC) 
to  conduct,  at  least  biannually,  and  to  report  to  specified  congressional  committees,  the 
FCC,  and  the  Secretary  of  Commerce  annually  on,  joint  electromagnetic  spectrum 
planning  with  respect  to:  (1)  future  spectrum  requirements  for  public  and  private  uses 
and  the  allocation  actions  to  accommodate  those  uses;  and  (2)  actions  to  promote  the 
efficient  use  of  the  spectrum. 

Introduced  February  2,  1993.  Referred  to  the  House  Committee  on  Energy  and 
Commerce,  Subcommittee  on  Telecommunications  and  Finance.  Passed  House  on  March 
2,  1993.  Received  in  Senate  and  Referred  to  Senate  Committee  on  Commerce,  Science 
and  Transportation  on  March  3,  1993. 

These  bills  are  incorporated  in  H.R.  2264,  the  Omnibus  Budget  Reconciliation 
Act  of  1993. 


H.R.  1091  (Clinger) 

A  bill  to  establish  the  Commission  on  Information  Technology  and  Paperwork 
Reduction.  Introduced  on  February  24,  1993.  Referred  to  House  Committee  on 
Government  Operations,  Subcommittee  on  Legislation  and  National  Security. 
Establishes  the  Commission  on  Information  Technology  and  Paperwork  Reduction  in 
order  to  minimize  the  information  reporting  burden  imposed  by  the  Federal 
Government,  consistent  with  the  information  needs  of  the  Government  for  policy 
purposes.  Lists  specific  Commission  functions,  which  include  the  study  and  review  of
former  Commission  on  Paperwork  recommendations  for  paperwork  reduction.  Requires 
a  final  Commission  report  to  the  Congress  and  the  President  and  action  by  the  Office 
of  Management  and  Budget  on  Commission  recommendations. 


H.R.  1312  (Boucher) 

Local  Exchange  Infrastructure  Modernization  Act  of  1993

A  bill  to  recognize  the  unique  status  of  local  exchange  carriers  in  providing  the  public
switched  network  infrastructure  and  to  ensure  the  broad  availability  of  advanced  public
switched  network  infrastructure.

Amends  the  Communications  Act  of  1934  to  require  the  Federal  Communications 
Commission  (FCC)  to  exercise  its  authority  to:  (1)  preserve  and  enhance  universal 
telephone  service  at  reasonable  rates;  (2)  achieve  universal  availability  of  advanced 
network  capabilities  and  information  services;  (3)  assure  a  seamless  nationwide
distribution  network  through  joint  network  planning,  coordination,  and  service
arrangements  between  and  among  local  exchange  carriers  (LECs);  (4)  maintain  high 
standards  of  quality  for  advanced  network  services;  and  (5)  assure  adequate 
communication  for  the  public  health,  safety,  defense,  education,  national  security,  and 
emergency  preparedness. 

Introduced  on  March  11,  1993.  Referred  to  House  Committee  on  Energy  and 
Commerce,  Subcommittee  on  Telecommunications  and  Finance  and  to  the  House 
Committee  on  the  Judiciary,  Subcommittee  on  Economic  and  Commercial  Law. 

H.R.  1504  (Boucher) 
Communications  Competitiveness  and 
Infrastructure  Modernization  Act  of  1993 

A  bill  to  encourage  the  modernization  of  the  Nation's  telecommunications 
infrastructure,  to  promote  competition  in  the  cable  television  industry  and  to  permit 
telephone  companies  to  provide  video  programming.  Introduced  on  March  29,  1993. 
Referred  to  House  Committee  on  Energy  and  Commerce,  Subcommittee  on 
Telecommunications  and  Finance. 

Amends  the  Communications  Act  of  1934  to  allow  a  common  carrier  to  provide 
video  programming  directly  to  subscribers  in  its  telephone  service  area  through  its  own 
facilities  or  an  affiliate.  Authorizes  the  common  carrier  to  provide  channels  of 
communications,  pole  line  conduit  space,  or  other  rental  arrangements  to  any  entity 
which  is  directly  or  indirectly  owned,  operated,  or  controlled  by  it  if  such  facilities  or 
arrangements  are  to  be  used  for,  or  in  connection  with,  the  provision  of  video 
programming  directly  to  subscribers  in  the  telephone  service  area  of  the  common 
carrier. 

Prohibits  a  common  carrier  from  providing  video  programming  directly  to 
subscribers  in  its  telephone  service  area  unless  the  programming  is  provided  through 
a  separate  video  programming  affiliate.  Requires  business  arrangements  and 
transactions  between  a  common  carrier  and  its  video  programming  affiliate  to  be 
pursuant  to  regulations  prescribed  by  the  Federal  Communications  Commission  and  to 
be  without  cost  to  the  telephone  service  ratepayers  of  the  carrier. 

Requires  any  common  carrier  which  provides  video  programming  directly  to 
subscribers  through  an  affiliate  in  its  telephone  service  area  to  establish  a  basic  video 
dial  tone  platform. 

Requires  such  common  carrier  to  make  a  maximum  of  75  percent  of  the  equipped 
capacity  of  its  basic  video  dial  tone  platform  available  to  unaffiliated  video  program 
providers.  States  that  the  carriage  of  local  broadcast  signals  shall  not  constitute  the 
provisions  of  affiliated  video  programming  under  this  Act. 

Sets  forth  prohibitions  on:  (1)  cross-subsidization  between  telephone  service  and 
video  programming  by  common  carriers;  and  (2)  common  carrier  buyouts  of  cable 
systems  located  in  the  carrier's  telephone  service  area. 

Requires  the  Commission  to  convene  a  Federal-State  Joint  Board  to  establish 
practices,  classifications,  and  regulations  necessary  to  ensure  proper  jurisdictional
separation  and  allocation  of  the  costs  of  providing  broadband  services,  including
affiliated  video  programming. 

Makes  provisions  of  this  Act  inapplicable  to  video  programming  provided  in  a  rural 
area  by  a  common  carrier  that  provides  telephone  exchange  service  in  such  area. 


H.R.  1613  (Collins,  C.) 

Telecommunications  Policy  Coordination  Act  of  1993 

A  bill  to  improve  coordination  in  the  formulation  of  telecommunications  policy 
within  the  executive  branch.  Introduced  on  April  1,  1993.  Referred  to  House 
Committee  on  Energy  and  Commerce,  Subcommittee  on
Telecommunications  and  Finance. 

Establishes  an  Office  of  Telecommunications  Policy  (OTP)  in  the  Executive 
Office  of  the  President.  Directs  OTP  to:  (1)  prepare  national  telecommunications  policy 
options;  (2)  serve  as  the  principal  advisor  to  the  President  on  telecommunications 
issues;  (3)  arbitrate  telecommunications  policy  disputes  among  Federal  agencies;  (4) 
communicate  the  views  of  the  agencies  and  the  President  concerning 
telecommunications  policy  to  the  Federal  Communications  Commission  (FCC)  and  the 
Congress;  and  (5)  monitor  developments  in  telecommunications  technology.  Requires 
the  Director  of  OTP  to:  (1)  establish  an  Advisory  Committee  on  Telecommunications 
Policy;  and  (2)  report  to  the  President  and  the  Congress  annually  on  OTP  activities  and 
on  emerging  trends  in  telecommunications. 

Requires  the  FCC  to  report  to  the  President  and  the  Congress  on  its  reasons  for 
taking  any  final  action  which  is  inconsistent  with  views  received  from  OTP. 

H.R.  1757  (Boucher) 

High  Performance  Computing  and  High  Speed  Networking  Applications 
Act  of  1993 

National  Information  Infrastructure  Act  of  1993 

A  bill  to  provide  for  a  coordinated  federal  program  to  accelerate  development  and 
dissemination  of  applications  of  high  performance  computing  and  high-speed 
networking.  Introduced  April  21,  1993.  Referred  to  House  Committee  on  Science, 
Space,  and  Technology,  Subcommittee  on  Science.  Hearings  held  by  Subcommittee  on 
May  6  and  11,  ordered  and  reported  as  amended  on  June  30,  1993. 

Amends  the  High-Performance  Computing  Act  of  1991  to  direct  the  Federal 
Coordinating  Council  for  Science,  Engineering,  and  Technology  to:  (1)  establish  an 
interagency  applications  program  to  develop  applications  of  computing  and  networking 
advances  under  the  National  High-Performance  Computing  Program;  and  (2)  develop 
a  Plan  for  Computing  and  Networking  Applications  which  shall  identify  application 
program  goals  and  priorities  and  set  forth  specific  Federal  agency  responsibilities. 

Requires  the  Plan  to:  (1)  foster  local  network  access  programs  and  their 
connection  with  Internet;  and  (2)  develop  projects  and  technologies  in  the  fields  of 
education,  health  care,  libraries,  and  government  information  access.

Provides  for  the  establishment  of  a  high  performance  computing  and  applications
advisory  committee. 

H.R.  2264  (Sabo) 

Companion  Measure,  S.  1134 

Omnibus  Budget  Reconciliation  Act  of  1993 

Licensing  Improvement  Act  of  1993 

Emerging  Telecommunications  Technologies  Act  of  1993

A  bill  to  provide  for  reconciliation  pursuant  to  section  7  of  the  concurrent 
resolution  on  the  budget  for  fiscal  year  1994.  Incorporates  H.R.  707  and  S.  335. 
Introduced  on  May  25,  1993.  House  Committee  on  the  Budget  Reported  an  Original 
Measure,  report  No:  103-111.  Passed  House  on  May  27,  1993.  Passed  Senate  in  lieu  of
S.  1134  on  June  25,  1993.  Senate  also  requested  a  conference.


S.  4  (Hollings)

Companion  Measure  H.R.  820 

Manufacturing  Technology  and  Extension  Act  of  1993 
Information  Infrastructure  and  Technology  Act  of  1992 
National  Competitiveness  Act  of  1993 
Wind  Engineering  Program  Act  of  1992 

A  bill  to  promote  the  industrial  competitiveness  and  economic  growth  of  the 
United  States  by  strengthening  and  expanding  the  civilian  technology  programs  of  the 
Department  of  Commerce,  amending  the  Stevenson-Wydler  Technology  Innovation  Act 
of  1980  to  enhance  the  development  and  nationwide  deployment  of  manufacturing 
technologies,  and  authorizing  appropriations  for  the  Technology  Administration  of  the 
Department  of  Commerce,  including  the  National  Institute  of  Standards  and 
Technology.  Introduced  on  January  21,  1993.  Referred  to  Senate  Committee  on 
Commerce,  Science  and  Transportation.  Committee  hearings  held  February  24,  1993 
and  March  25,  1993.  Ordered  and  reported  as  amended  on  May  25,  1993. 

TITLE  III--CRITICAL  TECHNOLOGIES,  Sec.  313.  Technical  Amendments 

...  (3)  The  Office  of  Technology  Monitoring  and  Competitive  Assessment 
is  authorized  to  (A)  act  as  a  focal  point  within  the  federal  government  for  the  collection 
and  dissemination,  including  electronic  dissemination,  of  information  on  foreign  process 
and  product  technologies,  including  information  collected  under  the  Japanese  Technical 
Literature  Program;  (B)  coordinate  the  extensive  foreign  technology  monitoring  and 
assessment  activities  already  under  way  in  the  federal  government;  (C)  act  as  an 
electronic  clearinghouse  for  this  information  or  otherwise  provide  this  function 
S.  335  (Inouye)

Companion  Measure  H.R.  707 

Emerging  Telecommunications  Technologies  Act  of  1993 
A  bill  to  require  the  Secretary  of  Commerce  to  make  additional  frequencies 
available  for  commercial  assignment  in  order  to  promote  the  development  and  use  of 
new  telecommunications  technologies.  Introduced  on  February  4,  1993.  Referred  to 
Senate  Committee  on  Commerce,  Science  and  Transportation,  Subcommittee  on 
Communications.  Hearings  held  on  March  17,  1993. 

Directs  the  Assistant  Secretary  of  Commerce  for  Communications  and 
Information  and  the  Chairman  of  the  Federal  Communications  Commission  (FCC)  to 
conduct  joint  spectrum  planning  meetings.  Directs  the  Secretary  of  Commerce  to:  (1) 
identify  bands  of  frequencies  that  may  be  reallocated  to  commercial  users;  and  (2) 
establish  a  related  advisory  committee.  Directs  the  FCC  to  submit  to  the  President  and 
the  Congress  a  plan  for  the  distribution  of  the  reallocated  bands  of  frequencies  under
this  Act. 

Authorizes  the  President  to  reclaim  reallocated  bands  of  frequencies  for 
reassignment  to  government  stations. 

This  bill  was  also  incorporated  in  H.R.  2264,  the  Omnibus  Budget 
Reconciliation  Act  of  1993. 

S.  473  (Johnston) 

Department  of  Energy  National  Competitiveness 
Technology  Partnership  Act  of  1993 

A  bill  to  promote  the  industrial  competitiveness  and  economic  growth  of  the 
United  States  by  strengthening  the  linkages  between  the  laboratories  of  the 
Department  of  Energy  and  the  private  sector  and  by  supporting  the  development  and 
application  of  technologies  critical  to  the  economic,  scientific  and  technological 
competitiveness  of  the  United  States.  Introduced  on  March  2, 1993.  Referred  to  Senate 
Committee  on  Energy  and  Natural  Resources,  Subcommittee  on  Energy  Research  and 
Development.  Committee  hearings  held  on  March  18,  23  and  24,  1993.  Reported  to
Senate  (amended)  by  Senate  Committee  on  Energy  and  Natural  Resources  on  June  24,
1993,  report  no.:  103-69.

Amends  the  Department  of  Energy  Organization  Act  to  authorize  the  Secretary 
of  Energy  and  the  directors  of  departmental  laboratories  (laboratories  operated  by  or 
on  behalf  of  the  Department  of  Energy  (DOE)  or  facilities  that  would  be  considered  to 
be  laboratories  under  the  Stevenson-Wydler  Technology  Innovation  Act  of  1980)  to
enter  into  any  partnership  that  will  enhance  the  economic,  scientific,  or  technological 
competitiveness  of  U.S.  industry. 

Directs  the  Secretary  to  develop  a  multi-year  critical  technology  strategy  for  each 
critical  technology  listed  in  the  National  Critical  Technologies  Report.  Authorizes  the
Secretary  and  the  directors  of  departmental  laboratories  to  enter  into  partnerships  that 
build  on  the  core  competencies  of  the  laboratories  to  conduct  research,  development, 
demonstration,  or  commercial  application  activities  in  areas  listed  in  the  Report  or  in
energy  efficiency  or  supply,  high-performance  computing,  the  environment,  human
health,  advanced  manufacturing  technologies,  advanced  materials,  transportation,  space,
or  quality  technologies,  or  technologies  listed  in  the  annual  defense  critical  technologies 
plan. 

Amends  the  High-Performance  Computing  Act  of  1991  to  provide  for  cost-shared 
projects  involving  DOE  or  DOE  laboratories  and  non-Federal  entities  to  test  and  apply 
high-performance  computing  and  high-speed  networking  technologies. 


S.  570  (Grassley) 

Local  Exchange  Infrastructure  Modernization  Act  of  1993 

A  bill  to  recognize  the  unique  status  of  local  exchange  carriers  in  providing  the 
public  switched  network  infrastructure  and  to  ensure  the  broad  availability  of  advanced
public  switched  network  infrastructure.  Introduced  on  March  11,  1993.  Referred  to 
Senate  Committee  on  Commerce,  Science  and  Transportation. 

Amends  the  Communications  Act  of  1934  to  require  the  Federal  Communications 
Commission  (FCC)  to  exercise  its  authority  to:  (1)  preserve  and  enhance  universal 
telephone  service  at  reasonable  rates;  (2)  achieve  universal  availability  of  advanced 
network  capabilities  and  information  services;  (3)  assure  a  seamless  nationwide 
distribution  network  through  joint  network  planning,  coordination,  and  service 
arrangements  between  and  among  local  exchange  carriers  (LECs);  (4)  maintain  high 
standards  of  quality  for  advanced  network  services;  and  (5)  assure  adequate 
communication  for  the  public  health,  safety,  defense,  education,  national  security,  and 
emergency  preparedness. 

Requires  the  FCC  to  prescribe  regulations  that  require:  (1)  joint  coordinated 
network  planning,  design,  and  cooperative  implementation  among  all  LECs  in  the 
provision  of  public  switched  network  infrastructure  and  services;  (2)  development  of 
standards  for  interconnection  between  the  LEC  public  switched  network  and  others  by 
appropriate  standard-setting  bodies;  and  (3)  a  LEC  to  share  public  switched  network 
infrastructure  and  functionality  with  requesting  LECs  which  serve  a  geographic  area 
for  which  they  lack  economies  of  scale  or  scope  for  the  particular  required  network 
functionality. 


S.  1086  (Danforth) 

Telecommunications  Infrastructure  Act  of  1993 

A  bill  to  foster  the  further  development  of  the  Nation's  telecommunications 
infrastructure  through  the  enhancement  of  competition.  Introduced  on  June  9,  1993. 
Referred  to  Senate  Committee  on  Commerce,  Science  and  Transportation, 
GOVERNMENT  INFORMATION


H.R.  629  (Owens,  M.) 

Improvement  of  Information  Access  Act  of  1993 

A  bill  to  amend  title  44,  United  States  Code,  to  promote  improved  public 
dissemination  of  Government  information.  Introduced  on  January  26,  1993.  Referred 
to  House  Committee  on  Government  Operations,  Subcommittee  on  Information,  Justice, 
Transportation  and  Agriculture. 

Requires  agencies  to:  (1)  disseminate  information  in  diverse  modes  and  through 
appropriate  outlets  that  will  permit  and  broaden  public  access  to  Government 
information;  and  (2)  use  depository  libraries,  national  computer  networks,  and  other 
distribution  channels  that  improve  and  assure  free  or  low-cost  public  access  to  Government
information. 


H.R.  1328  (Rose) 

Companion  Measure  S.  564--P.L.  103-40
Government  Printing  Office  Electronic 
Information  Access  Enhancement  Act  of  1993 

A  bill  to  establish  in  the  Government  Printing  Office  a  means  of  enhancing 
electronic  public  access  to  a  wide  range  of  Federal  electronic  information.  Introduced 
on  March  11,  1993.  Referred  to  House  Committee  on  House  Administration.  Reported
to  House  by  Committee  on  April  1,  1993,  report  no.:  103-51.

Requires  the  Superintendent  of  Documents,  under  the  direction  of  the  Public 
Printer,  to  establish  a  means  for  providing  the  public  with  online  access  to  electronic 
public  information  of  the  Federal  Government.

Sets  forth  guidelines  for  determining  fees  for  accessing  such  information. 
Permits  depository  libraries  to  access  information  through  such  means  without  charge. 

Requires  the  Public  Printer  to  report  to  the  Congress  on  the  savings  resulting 
from  such  online  public  access  to  government  information  and  on  the  status  of  the
system  providing  such  access. 


S.  560  (Nunn) 

Paperwork  Reduction  Act  of  1993 

A  bill  to  further  the  goals  of  the  Paperwork  Reduction  Act  to  have  Federal 
agencies  become  more  responsible  and  publicly  accountable  for  reducing  the  burden  of 
Federal  paperwork  on  the  public,  and  for  other  purposes.  Introduced  on  March  10, 
1993.  Referred  to  Senate  Committee  on  Governmental  Affairs. 

TABLE  OF  CONTENTS 

Title  I:  Authorization  of  Appropriations

Title  II:  Reducing  the  Burden  of  Federal  Paperwork  on  the  Public

Title  III:  Enhancing  Federal  Agency  Responsibility  and  Accountability  for  Reducing
the  Burden  of  Federal  Paperwork

Title  IV:  Enhancing  Government  Responsibility  and  Accountability  for  Reducing
the  Burden  of  Federal  Paperwork

Title  V:  Enhancing  Agency  Responsibility  for  Sharing  and  Disseminating  Public
Information

Title  VI:  Additional  Government  Information  Management  Responsibility

Requires  a  Government-wide  paperwork  reduction  goal  of  at  least  five  percent
and  individual  agency  goals  that  aggregate  to  the  Government-wide  goal.

Title  V:  Enhancing  Agency  Responsibility  for  Sharing  and  Disseminating  Public
Information  -  Provides  for  Government-wide  standards  for  sharing  and  disseminating
public  information.

Imposes  certain  responsibilities  on  Federal  agencies  for  sharing  and 
disseminating  public  information. 

Abolishes  the  Federal  Information  Locator  System  established  in  the  Office  of 
Information  and  Regulatory  Affairs  and  replaces  it  with  a  system  in  each  agency  for 
providing  public  access  via  electronic  and  other  means  to  a  comprehensive  inventory 
of  agency  information  dissemination  products. 

Provides  for  the  use  of  electronic  information  collection  and  dissemination 
techniques  to  reduce  the  Federal  paperwork  burden. 

S.  564--P.L.  103-40  (Ford) 
Companion  Measure,  H.R.  1328 

Government  Printing  Office  Electronic  Information 
Access  Enhancement  Act  of  1993 

A  bill  to  establish  in  the  Government  Printing  Office  a  means  of  enhancing
electronic  public  access  to  a  wide  range  of  Federal  electronic  information.  Introduced
on  March  11,  1993.  Referred  to  Senate  Committee  on  Rules  and
Administration.  Passed  Senate  on  March  22,  1993;  passed  House  on  May  25,  1993;
signed  into  law  on  June  8,  1993.

Requires  the  Superintendent  of  Documents,  under  the  direction  of  the  Public 
Printer,  to  establish  a  means  for  providing  the  public  with  online  access  to  electronic 
public  information  of  the  Federal  Government. 

Sets  forth  guidelines  for  determining  fees  for  accessing  such  information. 
Permits  depository  libraries  to  access  information  through  such  means  without  charge. 

Requires  the  Public  Printer  to  report  to  the  Congress  on  the  savings  resulting 
from  such  online  public  access  to  government  information  and  on  the  status  of  the 
system  providing  such  access. 
S.  681  (Glenn)

Paperwork  Reduction  Reauthorization  Act  of  1993 
Regulatory  Review  Sunshine  Act  of  1993 

A  bill  to  amend  chapter  35  of  title  44,  United  States  Code,  relating  to 
Government  paperwork  reduction,  to  modify  the  Federal  regulatory  review  process,
and  to  ensure  the  greatest  possible  public  benefit  from  information  collected,  maintained,
used,  disseminated,  and  retained  by  the  federal  government.  Introduced  on  March  31, 
1993.  Referred  to  Senate  Committee  on  Governmental  Affairs. 


EDUCATIONAL  APPLICATIONS 


H.R.  856  (Owens,  M.) 

Educational  Research,  Development,
and  Dissemination  Excellence  Act

A  bill  to  improve  education  in  the  United  States  by  promoting  excellence  in 
research,  development,  and  the  dissemination  of  information.  Introduced  on  February 
4, 1993.  Referred  to  House  Committee  on  Education  and  Labor,  Subcommittee  on  Select 
Education  and  Civil  Rights.  Subcommittee  hearings  held  on  May  27,  1993.  Field 
hearings  held  in  New  York,  New  York  on  June  19,  1993. 

TABLE  OF  CONTENTS: 

Title  I:  General  Provisions  Regarding  Office  of  Educational  Research  and 
Improvement 

Title  II:  National  Educational  Research  Policy  and  Priorities  Board 

Title  III:  National  Research  Institutes 

Title  IV:  National  Education  Dissemination  System

Title  V:  National  Library  of  Education 

Title  VI:  Leadership  For  Educational  Technology 

Title  IV  -  Amends  GEPA  to  establish  within  OERI  an  Office  of  Reform
Assistance  and  Dissemination  (Dissemination  Office),  through  which  the  Secretary  shall 
carry  out  a  national  education  dissemination  system  for  school  improvement.  Provides 
for  Dissemination  Office  functions  and  duties,  including:  (1)  identification,  designation, 
and  dissemination  of  exemplary  and  promising  programs  including  certain  training, 
technical,  and  financial  assistance;  (2)  16  Education  Resources  Information 
Clearinghouses;  (3)  dissemination  through  new  technologies;  (4)  an  electronic  network 
for  sources  of  materials  and  research  about  teaching  and  learning  for  improving 
nationwide  education  (SMARTLINE)  to  link  various  educational  research  and  other 
entities;  (5)  an  electronic  networking  and  resource-sharing  for  school  improvement 
program  of  grants  to  State  education  agencies;  (6)  a  networked  system  of  the  ten 
regional  educational  laboratories;  (7)  an  America  2000  Communities  Special  Assistance 
Program,  with  grants  for  Learning  Grant  Institutions  and  District  Education  Agents 
within  eligible  communities,  development  of  a  comprehensive  America  2000  plan  for
assuring  educational  success  for  all  students  in  the  community,  and  implementation  of 
a  community-wide  plan  for  educational  improvement;  (8)  the  Teacher  Research 
Dissemination  Network  (regional  partnerships  for  teacher  change  agents);  and  (9)  the 
existing  National  Diffusion  Network  and  its  Developer-Demonstrator  and  State 
Facilitator  projects. 


H.R.  2268  (Brown) 

A  bill  to  facilitate  the  development  of  an  integrated,  nationwide 
telecommunications  system  dedicated  to  instruction  by  guaranteeing  the  acquisition  of 
a  communications  satellite  system  used  solely  for  communications  among  State  and 
local  instructional  institutions  and  agencies  and  instructional  resource  providers. 
Introduced  on  May  26,  1993.  Referred  to  House  Committee  on  Education  and  Labor, 
Subcommittee  on  Elementary,  Secondary  and  Vocational  Education. 


S.  264  (Bingaman) 

Technology  for  the  Classroom  Act  of  1993 

A  bill  to  establish  a  Classrooms  for  the  Future  program.  Directs  the  Secretary 
of  Education  to  award  competitive  grants  to  eligible  consortia  to  develop  instructional
programs  and  technology-based  systems  for  complete  courses  or  units  of  study  for  a 
specific  subject  and  grade  level,  if  these  are  commercially  unavailable  locally. 
Introduced  on  January  28,  1993.  Referred  to  Senate  Committee  on  Labor  and  Human 
Resources. 


S.  1040  (Bingaman) 

Technology  for  Education  Act  of  1993 

A  bill  to  support  systemic  improvement  of  education  and  the  development  of  a 
technologically  literate  citizenry  and  internationally  competitive  work  force  by 
establishing  a  comprehensive  system  through  which  appropriate  technology-enhanced 
curriculum,  instruction,  and  administrative  support  resources  and  services,  that  support 
the  National  Education  Goals  and  any  national  education  standards  that  may  be 
developed,  are  provided  to  schools  throughout  the  United  States.  Introduced  on  May 
27,  1993.  Referred  to  Senate  Committee  on  Labor  and  Human  Resources. 

TABLE  OF  CONTENTS: 

Title  I:  Leadership  for  Technology  in  Education 

Title  II:  School  Technology  Support 

Title  III:  Information  Dissemination,  Technology  Training  and  Technical 
Assistance 

Title  IV:  Educational  Technology  Product  Development,  Production,  and 
Distribution 

Title  V:  Educational  Technology  Research,  Development  and  Assessment 
LIBRARY  APPLICATIONS


S.  345  (Pell) 

Library  of  Congress  Fund  Act  of  1993 

A  bill  to  authorize  the  Library  of  Congress  to  provide  certain  information 
products  and  services. 

Expresses  the  intent  of  the  Congress  that  core  Library  of  Congress  services  shall 
continue  to  be  provided  at  no  cost. 

Defines:  (1)  "core  library  products  and  services"  as  domestic  interlibrary  loan  and 
information  products  and  services  customarily  provided  by  libraries  to  users  at  no 
charge;  and  (2)  "specialized  library  products  and  services"  as  specified  customized
information  products  and  services  that  exceed  core  services,  that  are  not  national 
library  products  and  services,  and  that  are  designed  for  individuals  or  discrete  groups 
of  persons  or  entities. 

Declares  that  this  Act  shall  not  modify  Federal  copyright  law. 

Introduced  on  February  4,  1993.  Referred  to  Senate  Committee  on  Rules  and
Administration.  Committee  consideration  and  mark-up  session  held  on
May  20,  1993.  Reported  to  Senate  (amended)  by  Senate  Committee  on  Rules  and
Administration  on  May  26,  1993,  report  no.:  103-50.  Questions  were  raised  by  Senators
DeConcini  and  Feinstein  which  resulted  in  a  meeting  of  all  interested  parties  including 
House  and  Senate  Judiciary  Committee  staff  on  June  4.  Discussions  focused  upon 
amendments  proposed  by  the  information  and  publishing  industries. 


S.  626  (Kerrey) 

Electronic  Libraries  Act  of  1993 

A  bill  to  establish  a  system  of  State-based  electronic  libraries.  Provides  for  a 
system  of  State-based  electronic  libraries  which  (1)  allows  delivery  of  or  access  to  a  vast 
array  of  interactive,  multimedia  educational  programs,  research  and  information  data 
and  services,  and  networking  opportunities;  and  (2)  seeks  to  make  such  materials  available
to  the  public  via  public  libraries,  electronic  databases  and  telecommunications  systems 
such  as  the  Internet  and  other  networks.  Authorizes  the  National  Science  Foundation, 
the  Department  of  Education,  the  Department  of  Commerce,  the  Defense  Advanced 
Research  Projects  Agency,  and  the  Library  of  Congress  to  make  multi-year  grants  to 
states  to  develop  electronic  libraries.  Introduced  on  March  22,  1993.  Referred  to  Senate
Committee  on  Commerce,  Science  and  Transportation. 
HEALTH  SERVICES


S.  1088  (Harkin) 

Rural  Telemedicine  Development  Act  of  1993 

A  bill  to  amend  the  Public  Health  Service  Act  to  provide  grants  for  the
development  of  rural  telemedicine.  Introduced  on  June  10,  1993.  Referred  to  Senate
Committee  on  Agriculture,  Nutrition,  and  Forestry.


S.  1143  (Baucus)

A  bill  to  improve  the  delivery  of  health  care  services  in  rural  areas  by  creating 
an  Assistant  Secretary  for  Rural  Health,  to  amend  title  XVIII  of  the  Social  Security  Act 
to  provide  that  medical  assistance  facilities  be  reimbursed  based  on  reasonable  cost,  to 
establish  a  grant  program  for  the  use  of  interactive  telecommunications  systems,  and
to  adjust  the  payments  made  for  certain  direct  graduate  medical  education  expenses.
Introduced  on  June  22,  1993.  Referred  to  Senate  Committee  on  Finance.


PRIVACY  AND  INTELLECTUAL  PROPERTY 


H.R.  12  (Hughes)

A  bill  to  amend  title  17,  United  States  Code,  with  respect  to  infringement  of
copyright.  Makes  a  television  broadcast  station  an  infringer  of  copyright  and  subject  to
civil  remedies  (including  attorney's  fees  and  litigation  costs)  if  such  station,  without  the
express  written  consent  of  the  copyright  owner,  authorizes  the  secondary  transmission
by  a  cable  system  or  other  multichannel  video  programming  distributor  of  a  copyrighted 
work  broadcast  by  such  station.  Introduced  on  January  5,  1993.  Referred  to  House 
Committee  on  the  Judiciary,  Subcommittee  on  Intellectual  Property  and  Judicial 
Administration. 

H.R.  135  (Collins,  C.)

Individual  Privacy  Protection  Act  of  1993

A  bill  to  amend  the  privacy  provisions  of  title  5,  United  States  Code,  to  improve 
the  protection  of  individuals'  information  and  to  reestablish  a  permanent  Privacy
Protection  Commission  as  an  independent  entity  in  the  Federal  Government.

Establishes  an  Individual  Privacy  Protection  Board  to:  (1)  study  the  data  banks, 
automated  data  processing  programs,  and  information  systems  of  public  and  private
organizations  to  determine  standards  and  procedures  in  force  for  the  protection  of 
personal  information;  ...  and  (5)  comment  on  the  implications  for  data  protection  of 
proposed  Federal,  State,  or  local  statutes,  regulations,  or  procedures.  Provides  penalties
for  violations  of  privacy  rights. 

Introduced  on  January  5,  1993.  Referred  to  House  Committee  on  Government 
Operations,  Subcommittee  on  Information,  Justice,  Transportation  and  Agriculture. 


H.R.  759  (Boucher) 

Compulsory  License  Clarification  Act  of  1993 
A  bill  to  amend  chapter  1  of  title  17,  United  States  Code,  to  include  in  the  definition 
of  a  cable  system  a  facility  which  makes  secondary  transmissions  by  microwave  or 
certain  other  technologies.  Introduced  on  February  3,  1993.  Referred  to  House 
Committee  on  the  Judiciary,  Subcommittee  on  Intellectual  Property  and  Judicial 
Administration.  Subcommittee  hearings  held  on  March  17,  1993. 


H.R.  897  (Hughes) 
Companion  Measure  -  S.  373 
Copyright  Reform  Act  of  1993 

A  bill  to  amend  title  17,  United  States  Code,  to  modify  certain  recordation  and
registration  requirements  and  to  establish  copyright  arbitration  royalty  panels  to  replace
the  Copyright  Royalty  Tribunal.  Introduced  on  February  16,  1993.  Referred  to  House
Committee  on  the  Judiciary,  Subcommittee  on  Intellectual  Property  and  Judicial 
Administration.  Subcommittee  hearings  held  on  March  3  and  4,  1993. 


H.R.  1103  (Hughes) 

A  bill  to  amend  title  17,  United  States  Code,  with  respect  to  secondary 
transmissions  of  superstations  and  network  stations  for  private  home  viewing,  and  with 
respect  to  cable  systems.  Introduced  on  February  24,  1993.  Referred  to  House 
Committee  on  the  Judiciary,  Subcommittee  on  Intellectual  Property  and  Judicial 
Administration.  Subcommittee  hearings  held  on  March  17,  1993. 


H.R.  2576  (Hughes) 

Digital  Performance  Right  in  Sound  Recording  Act  of  1993 
A  bill  to  amend  title  17,  United  States  Code,  to  provide  an  exclusive  right  to
perform  sound  recordings  publicly  by  means  of  digital  transmissions.  Introduced  on
July  1,  1993.  Referred  to  House  Committee  on  the  Judiciary,  Subcommittee  on
Intellectual  Property  and  Judicial  Administration. 
S.  23  (Hatch)

A  bill  to  amend  title  17,  United  States  Code,  to  clarify  news  reporting  monitoring 
as  a  fair  use  exception  to  the  exclusive  rights  of  a  copyright  owner.  Introduced  on 
January  21,  1993.  Referred  to  Senate  Committee  on  the  Judiciary,  Subcommittee  on 
Patents,  Copyrights  and  Trademarks. 
APPENDIX  B:  CURRENT  GOVERNMENT-SPONSORED  RESEARCH
INITIATIVES  IN  DIGITAL  LIBRARIES 

prepared  by 

White  House  Office  of  Science  and  Technology  Policy


Current  Government-sponsored  Research  Initiatives  in  Digital  Libraries 
Prepared  by  the  White  House  Office  of  Science  and  Technology  Policy 
12  July  1993 

While  there  is  congressional  legislation  that  would  expand  government-sponsored  research  in  the
area  of  digital  libraries,  there  are  also  important  research  initiatives  already  underway.

"Linking  Electronic  Libraries"  is  an  Advanced  Research  Projects  Agency  (ARPA)  project  to  be 
carried  out  over  three  years  by  the  Corporation  for  National  Research  Initiatives  (CNRI)  in  Reston, 
VA.  This  is  a  small  prototype  of  an  electronic  copyright  management  system  that  initially  will 
store  and  disseminate  technical  reports  in  the  field  of  computer  science.  It  will  link  together 
Carnegie-Mellon,  MIT,  Stanford,  Cornell,  the  University  of  California,  ARPA,  and  the  Library 
of  Congress.  It  will  address  such  issues  as  electronic  submission  of  documents  (security  and 
integrity),  storage  in  an  online  repository,  the  digital  transfer  of  rights  and  permissions,  electronic 
payment,  and  user  interfaces.  It  is  an  experiment  designed  to  help  lay  the  groundwork  for  a 
national  electronic  library  of  scientific  and  technical  information.  Although  the  immediate  goal  of 
the  project  is  to  facilitate  access  to  computer  science  research  results,  it  also  has  the  goal  of 
producing  standard  protocols  for  search  and  retrieval,  authentication,  access  control,  bibliographic 
rights  and  permissions  management,  and  image  collection  and  storage.  This  initiative  is  funded  at 
a  level  of  $2.8M  for  the  first  year,  and  $2.5M  for  the  second  and  third  years. 

A  joint  National  Science  Foundation  (NSF)  initiative  is  more  research-oriented,  focusing  upon
the  technologies  involved  in  accessing  digital  libraries.  These  include  advanced  software  for
browsing  and  searching  information  in  a  variety  of  formats,  the  utilization  of  networked  databases
(meaning  elements  of  them  are  stored  in  different  locations),  and  the  capture  and  categorization  of 
information  in  a  number  of  formats.  This  is  seen  as  an  extension  of  NSF's  support  of  the  concept 
of  a  collaboratory,  a  distributed  computer  system  with  networked  laboratory  instruments,  tools  that
enable  a  variety  of  collaborative  activities,  the  resources  for  maintaining,  evolving,  and  assisting
in  the  use  of  computer-based  facilities,  and  digital  libraries  that  include  tools  for  organizing,
describing,  and  managing  data,  thus  enabling  the  large-scale  sharing  of  data.  Phase  I  of  this
activity  will  be  announced  in  July,  while  Phase  II  depends  upon  the  outcome  of  congressional
legislation.  NSF  and  ARPA  wish  to  commit  $5-6M  to  this  for  four  years.  Phase  II  would  include
participation  from  other  government  agencies  as  well  as  state  governments  and  industry.

Other  research  initiatives  will  come  from  the  Information  Infrastructure  Technology  and 
Applications  Program,  a  component  of  the  High  Performance  Computing  and  Communications
(HPCC)  program  which  the  Administration  created  earlier  this  year.  Research  under  this  program 
will  include  services  necessary  for  the  efficient  operation  of  the  National  Information  Infrastructure 
such  as  conventions  and  standards  for  handling  data  in  different  media,  the  development  tools  for 
the  creation  of  services,  and  work  on  intelligent  user  interfaces.  Among  the  areas  in  which  these 
services,  tools,  and  interfaces  will  be  applied  will  be  the  area  of  digital  libraries.  For  Fiscal  Year
1994,  $96M  has  been  requested  for  this  new  addition  to  the  HPCC  program.
Hard  copies  of  most  of  the  following  documents  are  available.  Some
documents  are  available  electronically,  as  stated,  but  might  not  contain  figures
in  the  ASCII  version.

Email,  fax,  mail  or  phone  your  name,  address,  email  and  phone  number  to:

Barbara  Lincoln  Brooks,  WAIS  Inc,  1040  Noel  Drive,  Menlo  Park,  CA,  94025,
phone:  415-327-WAIS,  fax:  415-327-6513,  email:  barbara@wais.com
************************************************************* 


WAIS  Inc.  Documents

******************** 

WAIS  Server  and  WAIS  Workstation  Technical  Description,  Release  1.0,  July, 
1993. 

WAIS  Inc.  Releases  New  Network  Publishing  Software,  April  29,  1993.  Press 
Release  announcing  WAIS  Inc.  and  its  products. 

WAIS  Inc.  Price  List,  April  1993. 

WAIS  Inc.  &  SUN  Microsystems  to  market  WAIS  technology,  April  29,  1993. 
Press  Release  announcing  partnership  with  SUN  Microsystems.

WAIS  Inc.  Question  and  Answer,  March  1993. 

WAIS  Inc.  Company  Story,  March  1993. 

Interfaces  for  Distributed  Systems  of  Information  Servers,  Brewster  Kahle, 
Harry  Morris,  Jonathan  Goldman  (Thinking  Machines  Corporation),  Thomas
Erickson  (Apple  Computer),  John  Curran  (NSF  Network  Service  Center),
March,  1992.  (Formerly  named  "Interfaces  for  Wide  Area  Information
Servers.")

Available  via  anonymous  ftp: 
/pub/wais-inc-doc/Interfaces.txt@ftp.wais.com
or  WAIS  server  wais-discussion-archives.src

An  Executive  Information  System  for  Unstructured  Files:  Wide  Area 
Information  Servers,  Brewster  Kahle,  Harry  Morris,  Franklin  Davis,  Kevin
Teine,  Clare  Hart,  Robin  Palmer.  November,  1991.  Description  of  the  Peat
Marwick  experiment,  similar  to  the  paper  in  Online  below.  Also  in  Electronic
Networking,  a  Meckler  publication,  Spring  1992,  pp.  59-68.

An  Information  System  for  Corporate  Users:  Wide  Area  Information  Servers, 
Brewster  Kahle,  April,  1991.  Thinking  Machines  technical  report  TMC-199. 
Also  in  ONLINE  Magazine  August  1991.  Report  on  the  system  constructed  for 
Peat  Marwick  and  other  corporate  users.     Has  screen  shots  of  WAIStation. 
Available  via  anonymous  ftp: 

/pub/wais-inc-doc/wais-corp.txt@ftp.wais.com  or  WAIS  server
wais-docs.src
WAIS  Bibliography,  Barbara  Lincoln  Brooks,  WAIS  Inc.,
July,  1993.  (This  list.)
Available  via  anonymous  ftp:
/pub/wais-inc-doc/bibliography.txt@ftp.wais.com  or
WAIS  server  wais-discussion-archive.src

Wide  Area  Information  Servers  Concepts,  Brewster  Kahle, 
November,  1989. 

Early  draft  of  paper  outlining  how  a  Wide  Area  Information  System 
could  grow. 

Available  via  anonymous  ftp: 

/pub/wais-inc-doc/wais-concepts.txt@ftp.wais.com or WAIS
server  wais-docs.src 

Brief  Description  of  WAIS  Sources,  Chris  Christoff,  November  1992. 

A brief description of the content of many WAIS sources on the Internet,
grouped into relevant categories.

Available via anonymous ftp:

/pub/wais-doc/dbdescription.txt@ftp.wais.com or WAIS server
wais-discussion-archive.src


WAIS    articles    &  publications 

*****************************

Pointing  finger,  WAIS  at  Internet  addresses,  MacWEEK.  Jeff  Ubois,  May  28, 
1993,  pp42&44. 

WAIS Offers Publishing Products, Open Systems Today. Paul Kapustka, May 10,
1993, pp13.

Unix servers distribute on-line information, InfoWorld. Cheryl Gerber, May 3,
1993, pp6.

Info  Access  Plan  Promises  Power  to  Fed  Users,  Federal  Computer  Week. 
Jennifer Jones, March 29, 1993, pp1&41.

A  Web  of  Networks,  an  Abundance  of  Services,  New  York  Times.  John  Markoff, 
February  28,  1993. 

Good-bye,  Dewey  Decimals,  Forbes  Magazine.  David  Churbuck,  February  15, 
1993,  pp204-205. 

Internet  Retrieval  Tools  Go  on  Market,  Network  World.  Ellen  Messmer, 
February  15,  1993,  pp29  &  77. 

Federal  Information  on  the  Internet,  Anna  Keller,  Library  of  Congress, 
February,  1993. 

Internet  of  the  Future  may  be  a  One-Stop  Information  Shop,  MacWEEK.  Margie 
Wylie,  January  25,  1993,  pp22&24. 

Index  Everything,  Share  It  Companywide  with  WAIS.  MacWEEK.  Daniel  P.  Dern, 
October  26,  1992,  pp24-25. 

Help is on the WAIS, American Libraries. M. Lukanuski, October 1992, pp742-744.

Feature  article  with  some  pros  and  cons  of  the  WAIS  protocol  from  the  library 
community  point  of  view. 

Information - the Commodity of the Future, Merit/NSFNET Link Letter
Newsletter. Merit/NSFNET Information Services, September/October 1992.
Follow-up  to  above  article,  explaining  how  Merit/NSFNET  is  utilizing  the 
different    information    services  available. 
Available  via  anonymous  ftp: 
/pub/wais-doc/linkletter2@ftp.wais.com

Identifying   and  Describing   Federal  Information   Inventory/Locator  Systems: 
Design  for  Networked-Based  Locators,  Charles  R.  McClure,  Joe  Ryan,  and 
William  E.  Moen,  School  of  Information  Studies,  Syracuse  University,  August 
25,  1992,  volume  1. 

A  Comparison  of  Internet  Resource  Discovery  Approaches,  M.  Schwartz,  A. 
Emtage,  B.  Kahle,  B.C.  Neuman,  August  1992.  Paper  to  appear  in  Computing 
Systems  5(4),  1992. 

In-Depth   overview   and   comparison  of  current  Internet  information  systems. 
Postscript  copy  available  via  anonymous  ftp: 
/pub/wais-doc/resource.compar@ftp.wais.com

WAIS:  The  Wide  Area  Information  Server  or  Anonymous  What???,  Peter 
Marshall,  June  18,  1992. 

Describes  and  details  the  implementation  of  WAIS  at  the  University  of  Western 
Ontario. 

Available   via  anonymous  ftp: 

/pub/wais-doc/UWO-wais-paper.ps@ftp.wais.com

Personal Computing: Collective Dynabases, Communications of the ACM. Larry
Press, June 1992, pp26-32.

Overview  of  WAIS  and  commercial  projects. 

WAIS:  A  New  Development  in  Information  Services,  MIT  I/S  Newsletter.  T. 
MacRae  and  S.  Jones,  June,  1992. 

Overview  of  WAIS  by  the  Network  Services  and  Publication  Services  at  MIT. 

Available  via  anonymous  ftp: 

/pub/wais-doc/MIT.IS.news@ftp.wais.com

WAIS: Wide Area Information Servers, Information Intelligence Inc. George S.
Machovec, March 1992, pp1-5.

Overview  of  WAIS  from  a  librarian  perspective. 

Available  via  anonymous  ftp: 

/pub/wais-doc/lib.perspective@ftp.wais.com

WAIS - Making it Easier to Access Internet Resources, Merit/NSFNET Link
Letter newsletter. Brewster Kahle, March/April 1992. Overview of WAIS.
(reprinted from CERFnet News, Volume 3 Number 6)

Available  via  anonymous  ftp: 

/pub/wais-doc/linkletter@ftp.wais.com

WAIS: Is It the Lotus 1-2-3 of the Internet?, Communications Week. Carl
Malamud, March 16, 1992, pp17.
Brief article on WAIS on the Internet.

Where There's a Will, There's a WAIS, Digital Media - A Seybold Report. Denise
Caruso, February 17, 1992, pp5-6.

Article touching on several issues of WAIS, such as protocol,
client/server  relationship,  "for  pay"  servers,  and  legal  issues. 

The Reading [remainder of title illegible], Brewster Kahle, February
[date illegible].
Essay on the controversy between private ownership of information and
public access to information.

The Promise of the WAIS Protocol, UNIX Today!, Jason Levitt,
December 9, 1991, pp44, 47-48.

Article  describing   the  freeware  release. 

The  Global  Village  Starts  with  WAIS,  Tomaso  Poggio,  December  1991, 
Overview  of  WAIS  in  Italian. 

Network to Unite Data Bases, San Jose Mercury News. John
Markoff, July 21, 1991, pp1F.

Rewriting  of  the  "For  the  PC  User,  Vast  Libraries,"  New  York  Times 
article  with  emphasis  on  Apple  component. 

For  Shakespeare,  Just  Log  On,  New  York  Times.  John  Markoff, 
July 3, 1991, ppC1.

Overview   of  WAIS  Internet  experiment. 

Browsing Through Terabytes, Byte Magazine. Richard Stein, May
1991, pp157-164.

Readable  article  on  what  a  large  WAIS  system  looks  like. 

WAIS Promises Easy Text Retrieval, MacWEEK. Henry Norr, May
14,  1991,  pg22. 

Report  on  the  Peat  Marwick  WAIS  system. 

Release [title illegible], Release 1.0. Esther Dyson, April 1991, entire issue.

In-depth article on commercial systems and protocols, featuring
WAIS (Hardcopy copies available from: EDventure Holdings, 375 Park Ave.,
New York, NY 10152; (212) 758-3434)

Anonymous  FTP: 

/pub/wais-doc/release1.0@ftp.wais.com
WAIS    server:  wais-discussion-archives.src 

Designing a Desktop Information System: Observations and Issues,
Thomas Erickson & Gitta Salomon, Apple Computer. Human Factors in
Computing Systems, CHI '91 Conference Proceedings (pp49-54) April 1991, New
Orleans. New York: ACM, 1991.

Early  paper  on  the  Apple  interface  for  WAIS. 

An Analysis of the Effects of Data Corruption on Text Retrieval
Performance, S. Smith, C. Stanfill, December 1988. Thinking Machines
Corporation technical report TMC-68.


Z39.50 & WAIS Protocol

*************************


Z39.50-1988: Information Retrieval Service Definition and Protocol
Specification  for  Library  Applications.     National  Information  Standards 
Organization  (Z39),  P.O.  Box  1056,  Bethesda,  MD  20817.    (301)  975-2814. 
Available  from  Document  Center,  Belmont,  CA.    Telephone  415-591-7600. 

Z39.50-1992  Version  2  Final  Text,  July  1992.    Working  copy  of  the  latest  Z39.50 
implementors  group.     National  Information  Standards  Organization  (Z39),  P.O. 
Box 1056, Bethesda, MD 20817. (301) 975-2814. Available from Transitions,
908-932-2280.

Z39.50-1991 Version 2, May 1991. Electronic version of the working copy of the
Z39.50 implementors group. Anonymous FTP:

/pub/protocol/z3950-v2d3.txt@ftp.wais.com or WAIS server wais-docs.

Z39.50-1992 Version 3, Draft 7, June 1993. Electronic version of the working
copy of the Z39.50 implementors group. Anonymous FTP:
/pub/protocol/z3950-v3d7.txt@ftp.wais.com.

The Z39.50 Information Retrieval Protocol: An Overview and Status Report,
Clifford Lynch, Computer Communication Review, ACM SIGCOMM. Introduction
to the protocol of WAIS.

The Z39.50 Protocol in Plain English, Clifford Lynch. Fall 1992.
Available via anonymous ftp:
/pub/protocol/plain.english@ftp.wais.com


Electronic  Services 

*******************


wais-discussion@think.com:  Bi-weekly  digest  of  mail  from  users  and 
developers   on  Electronic  Publishing  (includes   all   wais-interest  postings). 
Requests    to  wais-discussion-request@think.com. 
Anonymous  FTP  access  to  archives: 

/pub/mail-archives/wais-discussion/issue-*@wais.com

wais-talk@think.com:  interactive  list  of  developers.     A  couple 
notes  a  day.     Requests  to  wais-talk-request@think.com.  Archives 
are  available  on  WAIS   server  wais-talk-archives.src 

comp.infosystems.wais:  a  netnews  discussion  group  on  WAIS  issues.  All 
postings  to  wais-discussion@Think.COM  go  to  that  group  as  well. 

Z3950iw:  Z39.50  Implementors  list  for  low  level  discussions  of  protocol  details. 
Requests  to  LISTSERV@NERVM.NERDC.UFL.EDU 

Freeware  Servers: 

NeXT: /pub/freeware/next/*@ftp.wais.com

RS6000: /pub/freeware/rs6000/*@ftp.wais.com

SGI: /pub/freeware/sgi/*@ftp.wais.com

Source Code: /pub/freeware/unix-src/wais-8-*.tar.Z@ftp.wais.com

SUN: /pub/freeware/sun/*@ftp.wais.com

Freeware Clients:

Mac:            by Harry Morris, WAIS Inc.
                /pub/freeware/mac/*@ftp.wais.com

WINDOWS:        by Tim Gauslin, USGS
                /pub/freeware/windows/wnwais*.zip@ftp.wais.com

                by Kevin Gamiel, MCNC CNIDR
                /pub/NIDR.tools/wais/pc/windows@ftp.cnidr.org

XWAIS:          by Jonathan Goldman, Thinking Machines Corporation
                /pub/freeware/unix-src/wais-8-*.tar.Z@ftp.wais.com

DOS:            by Jim Fullton, University of North Carolina
                /pub/wais/DOS/*@sunsite.unc.edu or
                /pub/tcpip/pcwais.zip@hilbert.wharton.upenn.edu

EMail:          by Jonathan Goldman, Thinking Machines Corporation
                send message to waismail@quake.think.com,
                "search <source-name> {keywords}" or
                "retrieve DOCID" (DOCID as returned by a search)

OS2:            by Kevin Oliveau of WAIS Inc., Julie Mills and the
                Library of Congress
                /pub/freeware/os2/*@ftp.wais.com

Telnet access:  telnet quake.think.com login wais,
(uses SWAIS)    password user@host

SWAIS:          by John Curran, BBN
                /pub/freeware/unix-src/wais-8-*.tar.Z@ftp.wais.com

NeXT:           by Paul Burchard, University of Utah
                /pub/freeware/next/*@ftp.wais.com

GWAIS:          by Jonathan Goldman, Thinking Machines Corporation
(Gnu Emacs)     /pub/freeware/unix-src/wais-8-*.tar.Z@ftp.wais.com

Openlook:       by Simon Spero, University of North Carolina
                /pub/freeware/open-look/*@ftp.wais.com

VMS:            by Jim Fullton, University of North Carolina
                /pub/wais/vms/*@sunsite.unc.edu

Sun View:       /pub/wais/sunview/*@sunsite.unc.edu

IBM Mainframe:  by Tim Gauslin, USGS
                /pub/freeware/ibm-mvs/*@ftp.wais.com
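
The EMail client entry quotes two request forms, "search" and "retrieve". As a minimal sketch, assuming keywords are separated by single spaces, the request lines could be composed in Python as follows. The function names are hypothetical, and this list does not say whether the request belongs in the subject or the body of the message, nor whether the source name carries a ".src" suffix, so the sketch only builds the text.

```python
from typing import Sequence


def search_request(source_name: str, keywords: Sequence[str]) -> str:
    """Build a 'search <source-name> {keywords}' request line (hypothetical helper)."""
    words = [w.strip() for w in keywords if w.strip()]
    if not source_name.strip() or not words:
        raise ValueError("a source name and at least one keyword are required")
    # Assumption: keywords are joined with single spaces.
    return "search {} {}".format(source_name.strip(), " ".join(words))


def retrieve_request(docid: str) -> str:
    """Build a 'retrieve DOCID' request line; DOCID comes from an earlier search reply."""
    if not docid.strip():
        raise ValueError("a DOCID is required")
    return "retrieve {}".format(docid.strip())


if __name__ == "__main__":
    print(search_request("wais-docs", ["Z39.50", "protocol"]))
```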


WAIS  Videos 

Special Interest Group on Wide Area Information Servers: Conference Held
March 19, 1993, Open-File Report 93-252, United States Geological Survey video
on WAIS, VHS videotape $20. Send orders to Book and Open-File Report Sales,
USGS, Federal Center, Box 25286, MS 306, Denver, Colorado, 80225.

Wide Area Information Servers Class: Indexer and Server, Open-File Report
93-253, United States Geological Survey training video on WAIS, VHS videotape,
2-tape set $40. Send orders to Book and Open-File Report Sales, USGS, Federal
Center, Box 25286, MS 306, Denver, Colorado, 80225.

Macintosh Demonstration Screen-Movie. Steve Cisler put together a short
screen-recorder  movie  for  seeing  some  of  what  WAIStation  (Mac  client)  does. 
Available  via  anonymous  FTP: 

/pub/wais-doc/WAIStation-Canned-Demo.sit.hqx@ftp.wais.com


Internet  Information 

The  Whole  Internet:  User's  Guide  &  Catalog,  Ed  Krol,  O'Reilly  &  Associates,  Inc, 
1992.  (Chapter  12  entitled  "Searching  indexed  databases:  WAIS") 

Exploring  the  Internet:  A  Technical  Travelogue,  Carl  Malamud,  Prentice  Hall, 
1992. 

Internet access providers in the United States. The general types of services
they provide, and how to contact them. From Chapter 4 of the book, "Internet:
Getting Started". For more information about "Internet: Getting Started",
contact SRI International at 415-859-3695, nisc@nisc.sri.com.

Internet access providers outside the United States. From Chapter 7 of the
book, "Internet: Getting Started". For more information about "Internet:
Getting Started", contact SRI International at 415-859-3695, nisc@nisc.sri.com.

Public Dialup Internet Access List (PDIAL), February 1993. A list of public
access service providers offering dialup access to outgoing Internet
connections such as FTP and telnet. Available by sending email to
"info-deli-server@netcom.com", with the message subject "send PDIAL".

Other    services    that    can    gateway    to    WAIS  services: 
**************************************** 

Gopher 

by  the  University  of  Minnesota 
Via  anonymous  ftp: 

/pub/gopher@boombox.micro.umn.edu

World  Wide  Web 
by  Tim  Berners-Lee 
Via  anonymous  ftp: 
/pub/www/src@info.cern.ch 


Freeware  Information 

********************* 

For information on WAIS freeware or the Clearinghouse for Networked
Information  Discovery  and  Retrieval  (CNIDR),  contact  Jane  Smith  at 
jane.smith@cnidr.org  or  919-248-9213.     The  director  of  the  freeware  is  George 
Brett  at  ghb@jazz.concert.net  or  919-962-1000. 
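
Anonymous ftp locations throughout this list are written as path@host. As a minimal sketch, such an entry can be split into its host and path in Python. The names FtpLocation, parse_location and to_url are hypothetical helpers, not part of any WAIS distribution. Entries that contain a wildcard such as * name a set of files rather than a single file, so the URL form is only meaningful for entries without one.

```python
from typing import NamedTuple


class FtpLocation(NamedTuple):
    host: str
    path: str


def parse_location(entry: str) -> FtpLocation:
    """Split a 'path@host' entry, as written in this list, into host and path."""
    # The host follows the last '@'; everything before it is the path.
    path, sep, host = entry.strip().rpartition("@")
    if not sep or not host or not path.startswith("/"):
        raise ValueError("not a path@host entry: {!r}".format(entry))
    return FtpLocation(host=host, path=path)


def to_url(location: FtpLocation) -> str:
    """Render the location as an ftp:// URL (assumes the path has no wildcard)."""
    return "ftp://{}{}".format(location.host, location.path)


if __name__ == "__main__":
    loc = parse_location("/pub/wais-inc-doc/bibliography.txt@ftp.wais.com")
    print(to_url(loc))
```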
APPENDIX D: BRIEF DESCRIPTION OF WAIS SOURCES

by 

Chris  Christoff 


Brief  Description  of  WAIS  Sources 

A  brief  description  of  the  content 
of  many  WAIS  sources  on  the 
Internet,  grouped  into  relevant  categories. 


Chris  Christoff 
Bond  University 
chrisc@bu.oz.au 


Aeronautics
Archaeology
Astronomy
Biology
Chemical Engineering
Computer Platforms
Macintosh
PC
Sun Microsystems
Unix
Computer Science
Languages
Computer Software
Connection Machine (CM) information
CWIS
Gopher
WAIS
Education
Engineering
Environment
Finance
Graphics
Humanities
Journalism
Religion
Information Sources
Law
Libraries and Catalogues
Mathematics
Miscellaneous
Multimedia
Networks
Documentation and Standards
Security
Using the Internet
Phonebooks, Mail and Computer Lists
Recreation
Music
Food
Robotics
Research (Miscellaneous)
Science (general)
US Government Departments


External  Sources 

These are sources external to Bond University, i.e. out there on the Internet. A
grouping  of  the  sources  into  subject  categories  has  been  attempted. 

Note: sources marked (*) have limited hours of availability;
sources marked (#) support boolean, partial word and literal searches.

This  document  is  an  extension  and  reorganisation  of  a  great  document  from  the 
University  of  Melbourne,  Australia. 
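
The (*) and (#) markers noted above appear at the end of a source's description. A minimal Python sketch of separating them from the description text follows; the SourceEntry class and parse_entry function are hypothetical names introduced for illustration, and the sketch assumes the name and description have already been split apart.

```python
import re
from dataclasses import dataclass


@dataclass
class SourceEntry:
    name: str
    description: str
    limited_hours: bool    # marked (*) in the listing
    extended_search: bool  # marked (#): boolean, partial word and literal searches


# Matches one trailing marker, "(*)" or "(#)", with optional surrounding spaces.
_TRAILING_MARK = re.compile(r"\s*\((\*|#)\)\s*$")


def parse_entry(name: str, description: str) -> SourceEntry:
    """Strip trailing (*) / (#) markers from a description and record them as flags."""
    limited = extended = False
    while True:
        match = _TRAILING_MARK.search(description)
        if match is None:
            break
        if match.group(1) == "*":
            limited = True
        else:
            extended = True
        description = description[:match.start()]
    return SourceEntry(name, description.strip(), limited, extended)


if __name__ == "__main__":
    print(parse_entry("CM-fortran-manual.src", "Documentation for CM Fortran (*)"))
```

Only markers at the very end of the description are treated as flags, so an asterisk inside the text (as in "*Lisp") is left alone.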

Aeronautics 

aeronautics.src    Contents of the aeronautics mailing list ftp area

from University of Texas, covering many topics
of aeronautics, flying and aircraft

Archaeology 

archaeological_computing.src    Bibliography of papers on computing as applied to

archaeology  (BibTeX  format) 

Astronomy 

astro-images-fits.src    Astronomical  images  in  FITS  format 

astro-images-gif.src    Astronomical  images  in  GIF  format 

Biology 

alt.drugs.src    Oregon State University alt.drugs newsgroup

archive 

Arabidopsis_BioSci.src    Index  of  arabidopsis  conference,  and  bionet 

newsgroup  and  mailing  list  messages 

Arabidopsis_thaliana_Genome.src  AAtDB  database  including  genetic  maps,  strains, 

clones,  colleague  contacts  and  more 

biology-compounds.src    Database of metabolic intermediate compounds

biology-journal-contents.src    Periodical references to journals in the field of

molecular  biology 

bionic-ai-researchers.src    Database  of  molecular  biologists  working  in  the 

field  of  AI 

bionic-algorithms.src    Literature  references  to  molecular  biology 

algorithms 

bionic-arabidopsis.src    Database  of  arabidopsis  research  workers 

bionic-biosci-docs.src    Files  from  the  BioSci  network 

bionic-databases-limb.src    List  of  databases  available  to  molecular  biologists 

bionic-directory-of-servers    Indexes 'bionic' sources in Finland

bionic-embl-software.src    A  list  of  software  available  from  EMBL 

bionic-enzclass.src  use  in  conjunction  with  the  enzyme  source 

bionic-enzyme.src    Amos  Bairoch's  enzyme  database 

bionic-genbank-software.src    A list of software for Genbank database

bionic-info-gcg-archive.src    Log files from the INFO-GCG listserver

bionic-journal-contents.src    Literature references from molecular biology

journals 

bionic-networking.src   Texts  to  explain  networking  to  biologists 

bionic-sequence-bibliography  ....  Sequence  analysis  literature  reference  database 
biosci.src    Archive  of  articles  posted  to  BIOSCI  mailing  lists 

and  newsgroups  since  1989 
Caenorhabditis_elegans_Genome.src  A  database  of  C.  elegans  information  (e.g.  DNA 

sequence,  genetic  map  etc) 
cldb.src    Animal  cell  lines  available  in  European  research 

labs 

EC-enzyme.src    EC  enzyme  database 

IUBio-arcdocs.src    Index of abstracts, help files and information on
        Indiana University Archive of Biology Software
        and Data

IUBio-fly-address.src    Index of addresses of Drosophila researchers (#)

IUBio-fly-amero.src    Database of polytene chromosome sites that bind
        antibodies to Drosophila proteins (#)

IUBio-fly-clones.src    Index to sources of information on Drosophila
        melanogaster genetics (#)

IUBio-fly-din.src    Index of electronic Drosophila newsletter (#)

IUBio-flybase.src    Index of Drosophila database 'Flybase' (#)

IUBio-flystock-bg.src    Index of Drosophila fruit fly stocks at stockcentre at
        Bowling Green, USA (#)

IUBio-flystock-bl.src    Index of Drosophila fruit fly stocks at stockcentre at
        Bloomington, USA (#)

IUBio-flystock-um.src    Index of Drosophila fruit fly stocks at stockcentre at
        Umea, Sweden (#)

IUBio-gbnew.src    Index of updates of gene sequences since the last
        Genbank update (#)

IUBio-genbank.src    Index of Genbank databank gene sequences (#)

IUBio-netnews.src    Index of articles from Bio newsgroups and mailing
        lists (#)

Molecular-biology.src   Annotation  of  the  GenBank  DNA  sequence 

database 

online-mendelian-inheritance-in-man.src  Catalogues  of  Autosomal  Dominant, 

Autosomal  Recessive  and  X-Linked  Phenotypes 
NIH-Guide.src    US  National  Institutes  of  Health  Guide  to  Grants 

and  Programs  for  biomedical  researchers 

prosite.src    A  dictionary  of  protein  sites  and  patterns 

rebase-enzymes.src    The REBASE restriction database of Richard

Roberts, Cold Spring Harbor Laboratory
RPMS-pathology.src   Royal  Postgraduate  Medical  School  histo- 

pathological  images  (gif)  and  documents  on 

mammalian  endocrine  tissues 


Chemical  Engineering 

chem-eng-current-contents.src  ...  Chemical  engineering  bibliography 

Computer Platforms

comp.admin.src    Previous 10 days news in newsgroup comp.admin

comp.sys.src    Previous 10 days news in comp.sys*

scsi-2.src    Small Computer System Interface-2 draft ANSI
        standard

Amiga

amiga_fish_contents.src    Index of Fred Fish's disk #1 of Amiga software

amiga-slip.src    Archive of mailing list on Amiga wide area
        networking

archie.au-amiga-readmes.src    Index of readme, index and contents files for
        Amiga software on archie.au

Macintosh

archie.au-mac-readmes.src    Index of the Readme, Index and Contents files for
        the Mac archive on 'archie.au'

info-mac.src    Archive of the info.mac discussion forum

mac.FAQ.src    Archive of mac.all.FAQ news group

macintosh-tidbits.src    Tidbits electronic magazine for the Mac (*)

NeXT

NeXT.FAQ.src    Information about NeXT computer systems

NeXT-Managers.src    Archive of postings from mailing list for
        administrators of NeXT systems


PC 

archie.au-pc-readmes.src    Index of Readme, Index and Contents files for
        PC archives on 'archie.au'
cica-win3.src    Index  to  CICA  (Centre  for  Innovative  Computing 

Applications)  Windows  3  archive 
ibm.pc.FAQ.src    Information  about  IBM  PC  systems 

Sun  Microsystems 

alt.sys.sun.src    Archived news articles from alt.sys.sun newsgroup

sun-admin.src    Archive of comp.sys.sun newsgroup

sun-announce.src    Archive of comp.sys.sun.announce newsgroup

sun-apps.src    Archive of comp.sys.sun.apps newsgroup

sun-fixes.src    Sun Microsystems' bug patches README files

sun-hardware.src    Archive of comp.sys.sun.hardware newsgroup

sun-manager-summary.src    Index of sun-managers mailing list summaries (#)

sun-misc.src    Archive of comp.sys.sun.misc newsgroup

sun-openlook.src    Archive of comp.windows.open-look newsgroup

SunSITE-ftp.src    Index of all of the index and README files in
        SunSITE ftp archive (which contains s/w, pictures,
        sounds and documents for Sun computers)

sun-spots.src    Archive of the Sun-Spots digest and Sun Managers
        mailing list that discuss computers from Sun
        Microsystems

sun-wanted.src    Archive of comp.sys.sun.wanted newsgroup

sunflash-1990.src    1990 issues of The Florida Sunflash journal

sunflash-1991.src    1991 issues of The Florida Sunflash journal

sunflash-1992.src    1992 issues of The Florida Sunflash journal

Supercomputers 

San_Diego_Super_Computer_Center_Docs.src ..  Some  of  the  copyrighted  documents 

available  from  SDSC's  online  system  (userguides, 
Cray  languages,  math  libraries  etc) 

Unix 

comp.windows.x.motif.src   Archive  of  newsgroup  on  X-windows  and  motif 

posix.1003.2.src    Portable Operating System for Unix draft standards

UC-motif-FAQ.src    Frequently Asked Questions on motif from the

comp.windows.x.motif newsgroup

unix.FAQ.src    Information  about  UNIX 

unix-manual.src    Manual  pages  for  UNIX 

Computer  Science 

bib-dmi-ens-fr.src   Bibliography  of  books  and  conference  proceedings 

on  maths  and  comp  sci  (French  keywords) 

bibs-zenon-inria-fr.src    Bibliography of books, conference proceedings,
        theses, periodicals, research reports on software
        engineering and mathematics (French keywords)

cacm.src    Communications of the ACM April '89 - April '92

comp.archives.src   Archive  of  comp.archives  newsgroup,  giving  an 

index  of  ftp  accessible  files 

Comp-Sci-Tech-Reports.src          Computer  science  technical  reports,  abstracts  and 

papers  from  various  FTP  sites 

cscwbib.src    Bibliography  of  computer  supported  cooperative 

work  (refer  format) 

cs-journal-titles.src  Article  title  and  authors  from  over  600  computing 

journals,  conference  proceedings,  books  and 
seminars 

cs-techreport-abstracts.src    Titles  and  authors  of  some  5100  comp  sci 

techreports,  pre/reprints,  notes  and  papers 


cs-techreport-archives.src   List  of  sites  that  archive  compsci  reports 

cs-techreports.src    Index  of  2000  comp  sci  technical  reports  from  ftp 

sites 

Func-Prog_Abstracts.src   Small  collection  of  computer  science  tech  reports, 

abstracts  and  papers  on  functional  programming 
lolita-dator.src   Bibliography  of  a  selection  of  computer  related 

literature  from  Lund  University,  Sweden 
lp-bibtex-zenon-inria-fr.src    Proceedings from conferences on Logic
        Programming (bibtex format)

machos-bibtex-zenon-inria-fr    Bibliography on MACH operating system (bibtex
        format)

merit-archive-mac.src    Index of some 2000 Mac programs available via ftp
        from mac.archive.umich.edu

meval-bibtex-zenon-inria-fr    Bibliography of MEVAL project - network
        queueing theory and modelling (French and English,
        bibtex format)

MIT-algorithms-bug.src   Bug  lists  for  the  book  'Introduction  to  Algorithms' 

MIT-algorithms-exercise.src   Exercises  to  be  used  with  'Introduction  to 

Algorithms' 

MIT-algorithms-suggest.src    Suggestions submitted by readers of 'Introduction

to  Algorithms' 

monashuni-papers.src    List  of  articles  from  many  computing  journals 

monashuni-techreports.src    A  list  of  archive  sites  that  maintain  Computer 

Science  technical  reports 
neuroprose.src  Index  to  papers  on  neural  networks  on 

archive.cis.ohio-state.edu 

nren-bill.src    U.S.  High  Performance  Computing  Act  1991 

open_systems_calendar.src           Calendar  of  upcoming  events  related  to  Open 

Systems  computing 

ra-mime-zenon-inria-fr.src    Comp  Sci  and  engineering  reports  from  National 

Institute  of  Research  in  Computer  Science  and 

Control  (mime  format) 
ra-zenon-inria-fr.src    1990 activity reports from National Institute of
        Research in Computer Science and Control (French
        keywords, DVI format)

risks-digest.src    Collection of the RISKS digest which discusses the
        risks involved with using computers
SDSC_Docs.src   San  Diego  (State  Uni)  Supercomputer  Centre 

information  and  documentation 

software-eng.src    Archive  of  newsgroup  comp.software-eng 

tmc-technical-reports.src    Sampling  of  reports  from  Thinking  Machines  Corp 

UNTComputerDoc.src    Technical  documents  written  by  Academic 

Computing  Services  at  the  University  of  North 

Texas 

Languages 

comp.lang.perl.src    Index of news group comp.lang.perl (perl

computer  language) 

comp.lang.tcl.src  Archive  of  comp.lang.tcl  newsgroup  (tcl  computer 

language) 

tcl-talk.src    Think  Class  Library  discussion  list  (Think  C/Pascal 

for  Mac) 

Computer  Software 

ASK-SISY-Software-Information.src  Information  on  software  for  different  fields  of 

interest  to  universities 

comp.binaries.src   Archive  for  comp.binaries  newsgroup  (executable 

code  for  a  variety  of  operating  systems) 


comp.db.src    Previous    10    days    news    in  newsgroup 

comp.databases 

comp.emacs.src   Previous  10  days  news  in  newsgroup  comp.emacs 

comp.sources.src    Previous  10  days  news  in  comp.sources* 

comp.windows.ms.src    Archive  of  comp.windows.ms  newsgroup  (MS 

windows,  programming,  applications  etc) 

cosmic-abstracts.src    Abstracts  of  programs  in  the  COSMIC  inventory 

cosmic-programs.src    Sample  database  of  programs  developed  for  the  US 

Government 

fj.sources.src    Index to Japanese software archive on
        utsun.s.u-tokyo.ac.jp

hyperbole-ml.src    Archive of Hyperbole mailing list (information
        manager built on Emacs)

info-afs.src    Index of archives of mailing list on the Andrew File
        System

jargon.src    Collection  of  slang  terms  used  by  various 

subcultures  of  computer  hackers  and  network 
phreakers 

MacPsych.src   Archive   of  discussion   on   software  for 

psychologists 

sorrel-ada-archives.src    Software Reuse Repository Labs Ada Sources

wuarchive.src   The  directory  listing  of  the  software  archive 

wuarchive.wustl.edu 

Connection  Machine  (CM)  information 

Applications-Navigator.src           A  description  of  some  300  CM  applications  from 

many  fields  (e.g.  fluid  flow  to  AI) 

CM-applications.src   Applications  that  run  on  Thinking  Machines'  CM 

series  of  computers 

CM-fortran-manual.src    Documentation  for  CM  Fortran  (*) 

CM-images.src    Sample  images  from  calculations  done  on  CM 

computers 

CM-paris-manual.src    PARIS manual for programming the CM (*)

CM-star-lisp-docs.src  ..............  TMC  *Lisp  Reference  Manual  (*) 

CM-tech-summary.src   TMC  Technical  Summary  of  the  CM  System  (*) 

cm-zenon-inria-fr.src    Administrative information for Connection Machine
        at National Institute of Research in Computer
        Science and Control (DVI format)

CMFS-documentation.src    CM File Server Reference Manual (*)

CWIS 

bit.listserv.cwis-l.src    Archive of Campus Wide Information Systems
        (CWIS) listserver

Gopher

alt.gopher.src    Archive of the alt.gopher newsgroup

WAIS

alt.wais.src    Articles from the alt.wais newsgroup

au-directory-of-servers.src    Backup copy of the directory-of-servers at
        Thinking Machines Corp.

cicnet-directory-of-servers.src    Directory of servers at CICnet

cicnet-wais-servers.src    WAIS servers run at the CICnet network
        information

Connection-Machine.src    Databases on Connection Machine WAIS server
        e.g. factbook, biology, bible, NIH guide (*)

directory-of-servers.src    Directory  of  servers  at  quake.think.com 

directory-zenon-inria-fr.src           WAIS  sources,  at  National  Institute  of  Research  in 

Computer  Science  and  Control  (France) 


INFO.src   Same  as  directory-of-servers.src 

IUBio-INFO.src    Several biology WAIS sources (#)

SDSU-directory-of-servers.src  ...  San  Diego  State  University  directory  of  WAIS 

servers 

unc-directory-of-servers.src           University  of  North  Carolina  directory  of  WAIS 

servers 

wais-discussion-archives.src          Electronic  discussion  forum  about  WAIS 

wais-docs.src    WAIS  software  distribution  documentation 

wais-talk-archives.src    Informal  discussions  about  WAIS 

Education 

canada-asia-info.src   Curriculum  Resources  Database,  developed  by  the 

Asia  Pacific  Foundation  of  Canada 
catalyst.src    Articles  from  publication  Catalyst  on  community 

services  and  continuing  education 
educom.src  Documents,  summaries,  calendars  etc  from 

Educom  (association  for  higher  education  IT 

managers) 

ERIC-archive.src   ERIC  (Educational  Resources  Information  Centre) 

digests 

eric-digest.src    Short reports on topics of prime current interest in
        education

jte.src    Articles from the Journal of Technology Education

kidsnet.src    Archive  of  mailing  list  for  international  computer 

network  for  children  and  their  teachers 
k-12-software-reviews.src    Software and Courseware Online Review, contains

reviews  of  educational  software 
livestock.src    Educational  material  for  livestock  production  and 

management 

Engineering 

ijaema_a.src  Abstracts  of  papers  from  the  International  Journal 

of  Analytical  and  Experimental  Modal  Analysis 
software-eng.src    Archive  of  newsgroup  comp.software-eng 

Environment 

DOE_Climate_Data.src    Index  to  US  Dept  of  Energy  world  study  reports 

covering  subjects  from  air  pollution  to 
environmental  policies  to  geological  structure

environment-newsgroups.src   Archive  of  number  of  environmental  newsgroups 

Global_Change_Data_Directory.src  ..  Index  of  global  climatic  change  study  reports

great-lakes-factsheets.src   Factsheets  on  environmental  issues  and  subjects 

relevant  to  the  US  Great  Lakes/St  Lawrence  river 

lolita-miljo.src   Abstracts  of  environment  related  literature  from 

Lund  University,  Sweden 

midwest-weather.src   Weather  forecasts  for  US  midwestern  states, 

updated  hourly 

miljodatabas.src    Index  of  environmental  research  projects  from 

Lund  University,  Sweden 
NOAA_National_Environmental_Referral_Service.src  ..  US  National  Oceanic  & 

Atmospheric  Admin  environmental  tests  and 

available  data  (sun,  atmosphere,  earth  and  oceans) 
USGS_Earth_Science_Data_Directory.src  ..  US  Geological  Survey  directory  of  earth 

science  and  natural  resource  database


Finance 

agricultural-market-news.src    Agricultural  commodity  market  reports  compiled  by

the  US  Department  of  Agriculture


EIA-Petroleum-Supply-Monthly.src  Tables  and  figures  from  US  Dept  of  Energy, 

Energy  Information  Agency  on  disposition  of 
petroleum  products  (postscript  format) 

nafta.src    Full  text  of  the  North  American  Free  Trade 

Agreement 

usda-rrdb.src    US  Dept  Agriculture  agriculture  and  economic 

research 

wall-street-journal-sample.src  ....  A  couple  of  months  worth  of  the  Wall  Street

Journal (*) 

Graphics 

AVS_TXT_FILES.src   

comp.graphics.src    Previous    10    days    news    in  newsgroup 

comp.graphics 

sample-pictures.src    Sample  images  in  PICT  format 

Humanities 

acronyms.src    Large  list  of  acronyms  and  abbreviations 

Aesop-Fables.src   A  collection  of  over  300  fables  (RTF  format)

ANU-Pacific-Manuscripts.src  ....  Catalogue  of  microfilm  collection  of  Pacific  studies 

at  the  Australian  National  University 
ANU-SocSci-Netlore.src    Network  resources  useful  to  humanities  and  social 

science  researchers 

ANU-Thai-Yunnan.src    Bibliography  and  notes  of  Thai-Yunnan  Project  at 

the  Australian  National  University 

bryn-mawr-clasical-review.src  ...  Review  of  books  in  Latin  and  Greek  classics 

bush-speeches.src    Speeches  and  information  from  the  office  of  former

US  president  George  Bush 

clinton-speeches.src    Speeches  by  Bill  Clinton  for  1992  US  presidential 

campaign 

comp-acad-freedom.src    Computers  and  Academic  Freedom  lists  (policies, 

bibliographies  etc) 

computers-freedom-and-privacy.src  ..  Text  of  the  proceedings  of  the  conference 

"Computers,  Freedom  and  Privacy  II"  1992 

humanist.src    Volumes  of  the  Humanist  discussion  list 

maintained  at  Brown  University 

india-info.src    Miscellaneous  information  for  the  Indian 

community 

indian-classical-music.src    Music  titles  by  Indian  musicians 

israel-info.src    Information  on  the  State  of  Israel  (including  Near

East  Report  reprints) 

jiahr.src   Articles  from  the  Journal  of  International  Academy 

of  Hospitality 

MacPsych.src   Archive   of  discussion   on   software  for 

psychologists 

movie-lists.src    Archive  of  rec.arts.movies  lists  of  references  to  TV 

and  film  credits 

movie-reviews.src    Movie  reviews  submitted  by  network  newsgroup 

subscribers 

Omni-Cultural-Academic-Resource.src  ..  Collection  of  international/cultural  material 

including  food,  music,  language,  politics,  religion, 
travel  etc 

poetry.src    Complete  poetic  works,  including  the  complete 

poems  of  Shakespeare,  Yeats,  and  Elizabeth 
Sawyer 

proj-gutenberg.src   Documents  produced  by  Project  Gutenberg,  an 

effort  dedicated  to  the  creation  and  distribution  of 
English  language  electronic  texts 


roget-thesaurus.src   Roget's  Thesaurus,  provided  by  Project  Gutenberg 

sample-books.src    Sample  books  and  documents  indexed  at  Thinking 

Machines 

Science-Fiction-Series-Guide.src .  "Reviews"  of  the  major  works  of  selected  science 

fiction  writers,  and  list  of  works  on  alternate 
history  themes 

sf-reviews.src    Science  Fiction  review  articles 

simpsons.src   Capsules  for  each  episode  of  The  Simpsons 

thesaurus.src   As  roget-thesaurus.src 

toxic-custard-workshop.src    Sarcastic/black  humour

unced-agenda.src   Agenda  for  United  Nations  RIO  summit 

world-factbook.src   The  1990  World  Factbook  produced  by  the  CIA 

with  information  on  countries  and  cities  (*) 
world91a.src    1991  CIA  World  Factbook  with  information  on 

countries  and  cities 

Journalism 

factsheet-five.src    Information  on  'zines'   (underground,  low 

circulation  magazines) 

journalism.periodicals.src   The  Journalism  Periodicals  Index

london-free-press-regional-index.src  ..  Index  of  stories  in  the  London  Free  Press 

(London,  Canada) 

the-tech-v112.src    112th  volume  of  The  Tech,  MIT's  oldest  and  largest

newspaper 

vpiej-l.src    Mailing  list  for  electronic  publishing  issues, 

especially  related  to  scholarly  electronic  journals 

Religion 

ANU-Asian-Religions.src    Bibliographic  references  to  (mainly  Buddhist) 

Asian  religions 

bible.src    King  James  version  of  the  Bible  (*)

Book_of_Mormon.src   The  Book  of  Mormon  -  Gutenberg  version  11

Quran.src   The  Koran 

Information  Sources 

aarnet-resource-guide.src    A  copy  of  the  AARNet  Resource  Guide

academic_email_conf.src   Info  on  newsgroups  and  electronic  conferences 

(including  Kovacs'  scholarly  e.c.  list) 
archie.au-ls-lRt.src   An  index  of  the  files  on  the  Australian  archive  site 

'archie.au' 

archie-orst.edu.src   Index  to  SURAnet  archie  database  of  computer 

software 

cicnet-resource-guide.src   Guide  to  some  internet  resources 

comp.doc.techreports.src    Availability     of     tech     reports  from 

comp.doc.techreports  newsgroup  and  various  FTP

sites 

elec_journ_newslett.src    Information  on  electronic  journals  and  newsletters

for  many  disciplines  (based  on  Strangelove's 
directory) 

fidonet-nodelist.src    A  list  of  nodes  in  the  Fidonet  network

file-archive-uunet.src   Directory  listing  of  the  archive  on  uunet.uu.net

finding-sources.src    Finding  information  on  the  network 

ftpable-readmes.src    Database  of  README  files  from  anonymous  FTP 

sites  around  the  world 

ftp-list.src    Jon  Granrose's  anonymous  FTP  list

jik-usenet.src    FAQ  articles  from  various  newsgroups 

lists.src    Several  master  lists  of  newsgroups,  mailing  lists,

electronic  serials  and  journals 
netinfo.src   Index  of  text  files  relating  to  administration  of  the 


Internet 

network-bibliography.src    Network  related  bibliographies 

news.answers-faqs.src    Frequently  Asked  Questions  on  all  subjects  from 

news.answers  newsgroup 
news-conf.src    Conference    announcements    posted  to 

news.announce.conferences  newsgroup 
quake.think.com-ftp.src   Directory  of  README  files  at  the  Thinking 

Machines  Corp.  ftp  server 
UNC_BBS_Info.src   University  of  North  Carolina  bulletin  board 

services 

unc-ch-info.src    Most  of  University  of  North  Carolina's  INFO 

database  (maintained  by  Judy  Hallman) 

utsun.s.u-tokyo.ac.jp.src    Directory  of  major  Japanese  FTP  site

utsun.s.u-tokyo.ac.jp

uunet.src    UUNET  directory  listing  of  FAQs  from  all

newsgroups 

uxc.cso.uiuc.edu.src    Recursive  directory  listing  of  uxc.cso.uiuc.edu 

Law 

columbia-law-library.src    A  subset  of  the  Columbia  Law  School  online  card 

catalogue 


columbia-spanish-law-catalog.src  ....  Columbia  Law  School  index  to  Hispanic  legislation
computers-freedom-and-privacy.src  ..  Text  of  the  proceedings  of  the  conference

"Computers,  Freedom  and  Privacy  II"  1992

eff-documents.src    Documents  and  newsletters  from  the  Electronic 

Frontier  Foundation  (education,  policy,  awareness, 
law  etc  applied  to  computers  and  communications) 

eff-talk.src    Archive  of  the  newsgroup  comp.org.eff.talk 

(Electronic  Frontier  Foundation) 

law-employers.src    Summary  of  legal  employers  in  the  US

patent-sampler.src   About  2  weeks  of  patent  applications  at  the  US 

Patent  Office  (*) 

rkba.src   Files  relating  to  the  US  Right  to  Keep  and  Bear 

Arms 

supreme-court.src    US  Supreme  Court  decisions  in  full  text

us-judges.src    Records  of  clerkship  application  requirements  for 

US  Federal  and  upper  level  State  courts 

Libraries  and  Catalogues 

bit.listserv.pacs-l.src    Discussion  about  computer  systems  provided  by 

libraries  to  their  patrons 
columbia-law-library.src    A  subset  of  the  Columbia  Law  School  online  card 

catalogue 

comp.internet.library.src   Index  to  newsgroup  on  electronic  libraries

current.cites.src   Index  of  more  than  30  journals  for  articles  on 

electronic  publishing,  optical  disk  technologies, 
computer  networking,  information  transfer  and 
related  topics 

dit-library.src    Dept  of  Computer  Engineering,  Lund  University, 

Sweden,  library  catalogue 
hytelnet.src    Information  sources  accessible  by  TELNET

including  library  OPACs  (catalogues),  bulletin 

boards,  and  others 

inet-libraries.src    Information  on  accessing  Internet  and  Janet  (UK) 

accessible  libraries 

online-libraries-st-george.src          Art  St  George's  directory  of  libraries  and  CWIS's 

available  over  the  network,  together  with  access 
details 


tmc-library.src   A  catalogue  of  the  library  at  Thinking  Machines 

Corp. 

Mathematics 

bib-cirm.src    Books  and  conferences  proceedings  in  mathematics 

(French  keywords) 

bib-dmi-ens-fr.src   Bibliography  of  books  and  conference  proceedings 

on  maths  and  comp  sci  (French  keywords) 

netlib-index.src    Indexes  of  the  netlib  mathematical  software  archive 

s-archive.src   Mailing  list  archive  for  discussions  about  the  S 

statistical  analysis  software 
sas-archive.src   Mailing  list  archive  for  discussions  about  SAS 

statistical  analysis  software 
spss-archive.src    Mailing  list  archive  for  discussions  about  SPSS 

statistical  analysis  software 
stats-archive.src    US  statistics  theory  mailing  list  archive 

Miscellaneous 

edis.src    California's  Emergency  Digital  Information  System 

news  release  test  messages 
sustainable-agriculture.src    Information  on  appropriateness  of  technology, 

organic  farming,  gardening  etc 
weather.src  Weather  information,  including  surface  analysis 

weather  system  maps 
zipcodes.src   USA  zipcode  database 

Multimedia 

comp.multi.src    Index  of  news  group  comp.multimedia 

comp.text.sgml.src    Archive  of  Standard  Generalized  Markup  Language 

newsgroup 

disco-mm-zenon-inria-fr.src          Multimedia  documents  in  Internet  MIME 

multimedia  mail  format 
mime-samples.src    Multimedia  documents  in  Internet  MIME 

multimedia  mail  format 
SGML.src    Standard   Generalized   Markup  Language 

information 

SIGhyper.src    Documents  from  the  SGML  Users'  Group  SIG  on 

Hypertext  and  Multimedia 

Networks 

bcs-calendar.src   BCS  calendar  for  this  month  and  next  month 

bit.listserv.cdromlan.src    Archive  of  mailing  list  on  cdrom  products  and 

LANs 

comp.dcom.fax.src    Archive  of  comp.dcom.fax  newsgroup  (fax 

hardware,  software  and  protocols) 

com-priv.src   Discussions   about  issues   related   to  the 

commercialisation  and  privatisation  of  the  Internet 

disi-catalog.src    Availability    and    capability    of  X500 

implementations 

eff-documents.src   Documents  and  newsletters  from  the  Electronic 

Frontier  Foundation  (education,  policy,  awareness, 
law  etc  applied  to  computers  and  communications) 

eff-talk.src    Archive  of  the  newsgroup  comp.org.eff.talk

(Electronic  Frontier  Foundation) 

matrix_news.src   Articles  etc.  from  Matrix  News  monthly  newsletter 

(Matrix  News  and  Directory  Services,  Inc.) 

mailing-lists.src   Lists  of  newsgroups,  mailing  lists,  electronic 

serials  and  journals,  with  access  details 

merit-nsfnet-linkletter.src    Articles  about  the  NSFNet  and  the  Internet 


network-tools.src    Descriptions  and  documentation  about  software 

tools  for  network  monitoring  and  management 

netbib.src    Bibliography   of   research   on  broadband 

networking,  video  and  sound 

phrack.src    All  issues  of  Phrack  -  an  old  hacking  and 

phreaking  newsletter 

ripe-database.src    RIPE  (Reseaux  IP  Europeens)  network  contacts

database 

usenet-cookbook.src   The  USENET  Cookbook 

usenet-FAQ.src    Some  of  the  FAQ  articles  from  USENET 

x.500.working-group.src    Information  about  the  availability  and  capability  of

X.500  implementations 

Documentation  and  Standards 

ietf-docs.src    Internet  Engineering  Task  Force  working  documents

ietf-drafts.src   Internet  Engineering  Task  Force  drafts  and  working

documents 

internet-documents.src    Database  of  Internet  Engineering  Task  Force 

(IETF)  documents,  including  working  group 
charters  and  minutes 

internet-drafts.src    Draft  copies  of  future  Internet  RFC  (Request  for 

Comment)  documents 
internet-resource-guide.src    Guide  to  using  the  Internet

internet-rfcs.src   Internet  Request  for  Comment  documents 

internet-standards.src    Subset  of  RFC's  (Request  For  Comment

documents  of  internet  'standards') 
Internet-user-glossary.src    Glossary  of  internet  technical  terms  from  IETF 

working  group 

open_systems_calendar.src           Meetings  of  OS  committee  and  working  groups 

rfc-index.src    An  index  of  the  list  of  Internet  RFCs 

ripe-documents.src   All  documents  available  from  ftp.ripe.net  (RFC, 

IETF,  IESG,  RIPE  and  more)

ripe-internet-drafts.src    All  internet  drafts  available  from  ftp.ripe.net

ripe-rfc.src    All  RFCs  available  from  ftp.ripe.net

Security 

cert-advisories.src   Computer  Emergency  Response  Team  advisories 

on  OS  patches  to  correct  security  problems 

cert-clippings.src    CERT  clippings  on  security,  holes  and  patches 

from  various  newsgroups 

comp.risks.src   Archive  of  comp.risks  newsgroup  (risks  to  public

in  computers) 

Using  the  Internet 

internet_services.src    Documents  describing  services  available  on  the 

Internet 

internet_info.src    Texts,  guides  and  info  on  internet  use  and  etiquette

netinfo-biblio.src    Bibliography  of  documents  on  using  information 

services  on  the  Internet 
netinfo-docs.src    Various  files  with  information  on  accessing  the 

Internet  and  its  resources 
zen-internet.src    Zen  and  the  Art  of  the  Internet  document,  a 

network  introduction 

Phonebooks,  Mail  and  Computer  Lists 

bitearn.nodes.src    Database  of  computers  on  BITNET  center 

cissites.src    List  of  contacts  for  organisations  in  the  former 

Soviet  Union  who  have/plan  email 

college-email.src    Email  formats  for  US  Uni  students  (by  institution) 

congress.src    Names,  addresses  and  'phone  numbers  for  each

US  state  congressman

domain-contacts.src   Internet  network  domains  and  their  contacts 

domain-organizations.src    Network  domain  names  and  organisations 

fidonet-nodelist.src    A  list  of  nodes  in  the  Fidonet  network 

info-nets.src    Archive  of  infonets  mailing  list 

internet-domain-contacts.src          Internet  domains  and  contact  information  for  the

responsible  parties 

internet-phonebook.src    Index  of  the  NSF  Network  Service  Center

Network  Managers  Phonebook 

irtf-rd.src    IRTF  Resource  Discovery  mailing  list 

monashuni-phonedir.src    Monash  University  'phone  directory 

online@uunet.ca.src    The  Online  mailing  list  for  information  brokers  and

other  people  who  search  on  line  databases 

SDSU_PhoneBook.src    San  Diego  State  University  'phone  directory 

sfsu-phones.src   San  Francisco  State  University  'phone  directory 

UNC_Staff_Phone.src    University    of  North  Carolina  staff  'phone 

directory 

UNC_Student_phone.src    Directory  of  students,  University    of  North 

Carolina,  Chapel  Hill,  USA
uk-name-registration-service.src  .  Database  of  UK  hostnames  and  addresses 

usace-spk-phonebook.src    US  Army  Corps  of  Engineers  'phone  directory

usenet-addresses.src   A  database  of  e-mail  addresses  of  people  who  post 

to  USENET 

uumap.src   Tracks  computers  that  are  UUCP  and  Usenet  sites 

around  the  world 

whois.src   Whois  service  for  finding  information  on  internet 

domains,  networks,  hosts,  organisations  and 
people 


Music

BGRASS-L.src    Archive  of  mailing  list  on  discussion  of  Blue  Grass

music

cdbase.src   Database  of  compact  disk  titles,  record  company

and  item  number

early-m.src    Archives  of  discussions  on  early  (medieval,

renaissance,  baroque)  music  from  rec.music.early
newsgroup  and  earlym-l  listserver

lyrics.src    The  lyrics  for  a  selection  of  contemporary  music

midi.src   Musical  Instrument  Digital  Interface  documents

music-surveys.src    Comments  on  performers  and  music  from

rec.music  newsgroups

MuTeX.src   Archive  of  discussion  on  TeX  typesetting  music

from  the  Mutex  mailing  list
rec.music.early.src   As  early-m.src

Recreation

falconS.src    Articles  on  flight  simulation  computer  games

homebrew.src    Discussion  on  the  art  of  brewing  your  own  (beer

that  is)

movie-lists.src    Archive  of  rec.arts.movies  lists  of  references  to  TV

and  film  credits

movie-reviews.src    Movie  reviews  submitted  by  network  newsgroup

subscribers

netrek-ftp.src    Archive  of  information  on  Netrek  (game)

rec.gardens.src    Index  of  articles  from  the  rec.gardens  recreational

gardening  newsgroup

rec.pets.src    10  days  news  from  rec.pets  newsgroup  (dogs,

cats,  etc)

Food

recipes.src    Recipes


Robotics 

comp.robotics.src    Archive  of  comp.robotics  newsgroup 

Research  (Miscellaneous) 

cerro-l.src    Mailing  list  contributions  on  research  in  Central 

Europe  from  the  Central  European  Regional 
Research  Organisation 

eos-ncsu.src    Online  help  for  N.C.  State  University's  Project  Eos 

ut-research-expertise.src    University  of  Texas  Catalogue  of  Research 

Expertise 

UIO_Publications.src   Research  publications  at  the  University  of  Oslo 

US-Gov-Programs.src    US  Government  research  programmes

NCGIA-technical-reports.src         NCGIA  Technical  Reports 

nsf-awards.src   Abstracts  for  awards  made  by  the  US  National

Science  Foundation 

nsf-pubs.src    Publications  of  the  US  National  Science 

Foundation 

unimelb-research.src   1990  University  of  Melbourne  (Australia)  research 

report 

Science  (general) 

sci.src   News  from  sci.*    (science)  newsgroups  e.g. 

aeronautics,  electronics,  medicine,  physics,  space
water-quality.src    Education  material  on  water  quality  assessment 


US  Government  Departments 

Ota.src    US  Office  of  Technology  Assessment  policy

documents 

US-State-Department-Travel-Advisories.src  ..  Archive  of  mailing  list  of  US  State 

Dept  world  wide  consular  information  sheets  and 
travel  warnings 


APPENDIX  E:  SIGWAIS  MAILING  LIST 


SIGWAIS 
LIBRARY  OF  CONGRESS 
July  23,  1993 


The  following  is  the  list  of  those  attending  the  conference  July  23,
1993,  and  of  those  wishing  to  be  included  on  the  mailing  list  only.     It  is 
final  as  of  August  19,  1993. 


Adams,  Marcia 

Smithsonian  Institution  Libraries 
Natural  History  Bldg.  Rm.  24M 
Washington,  D.C.  20560 
Voice:  202-357-2163 
Fax:  202-633-9291 
Internet:  LIBEM003@sivm.si.edu 

Aikins,  Mike 

National  Libraries  Project 

DEC  Australia 

Internet: 

aikins@akuna.enet.dec.com

Allen,  Wayne 
EINet 

MCC/ISD,  3500  West  Balcones  Center 
Dr,  Austin,  Tx  78759 
Voice:  (512)338-3754 
Fax:  (512)338-3897 
Internet:  wa@mcc.com 

Allison,  G.  Burgess 
The  MITRE  Corporation 
7525  Colshire  Drive 
McLean,  VA  22102 
Voice:  703-883-7548 
Fax:  703-883-1367 
Internet:  allison@mitre.org 

Amon,  William  F.
MITRE  Corporation 
Office  of  Naval  Intelligence 
ONI  Bldg  1,  Room  301 
4301  Suitland  Road 
Washington,  DC  20395-5000 
Voice:  301-763-3586 
Fax:  301-967-3071 
Internet:  wamon@mitre.org 


Anderson,  William  L. 

Xerox  Corp.  817-03A 

295  Woodcliff  Drive 

Fairport,  NY  14450

Voice:  716-383-7983 

Fax:  716-264-5125 

Internet:  anderson.roch817@xerox.com

Annecchini,  Frank
DOE 

Voice:  202-586-4463 
Fax:  202-586-0746 

Arret,  Linda 
Library  of  Congress,  HSS 
Voice:  202-707-1490 
Internet:  larr@seql.loc.gov 

Ashley,  Maryle 
Library  of  Congress/ITS 
Washington,  DC  20540 
Voice:  202-707-9641 
Internet:  mash@mash.loc.gov 

Atlee,  Dick 

Voice:  301-405-3011 

Fax:  301-314-9220 

Internet:  atlee@umdd.umd.edu 

Aylor,  Jim 

University  of  Virginia 

Thornton  Hall,  Electrical  Engineering 

Charlottesville,  VA  22903 

Voice:  804/924-6100 

Fax:  804/924-8818 

Baker,  Bob

Argonne  National  Laboratory 
9700  South  Cass  Ave.  -  Bldg  900 
Argonne,  IL  60439 
Voice:      (708)  252-3608 
Fax:  -5128 
Internet:  baker@eid.anl.gov 


Barger,  Kyle 

Haverford  College  Academic  Computing 
370  Lancaster  Avenue 
Haverford,  PA  19041 
Voice:  (215)  896-1373

Internet:  kbarger@haverford.edu

Barr,  John
Health  Sciences  Libraries  Consortium 

Philadelphia,  PA 
Internet:  barr@hslc.org 

Bassetti,  Ottavia 

Thinking  Machines  Corporation 

245  First  Street 
Cambridge  MA  02142 
Voice:         (617)  234-5522 
Internet:  ottavia@think.com 

Bausenbach,  Ardie 
Library  of  Congress/APLO 
Washington,  DC  20540 
Voice:  202-707-2551 
Internet:  bausenba@mail.loc.gov 

Bean,  Charles 

Library  of  Congress 

Serial  and  Government  Publications

Division 

Washington,  DC  20540-0001 

Voice:  (202)  707-2955

Fax:  (202)707-6128 

Internet:  cbea@seql.loc.gov 

Belani,  Kaushi 
Library  of  Congress 
Washington,  DC  20540 
Voice:  202-707-9584 
Fax:  202-707-0955 
Internet:  belani@mail.loc.gov 

Belton,  Jen 

Director,  The  Washington  Post  News

Research  Center

1150  15th  St  NW

Washington  DC  20071

Voice:  202  334-6762

Fax:  202  334-5575

Internet:  jbelton@digex.net


Bennett,  Nancy 

Information  Center 

Office  of  Technology  Assessment 

US  Congress 

Washington  DC  20510 

Voice:  202-228-6154 

Fax:  202-228-6098

Internet:  nbennett@ota.gov 

Bernheisel,  Mary 
Library  of  Congress,  ITS 
Voice:  202-707-5375 
Fax:  202-707-0955 
Internet:  mber@seql.loc.gov

Bielawski,  Marvin 

Systems  Librarian 

Princeton  University  Libraries 

One  Washington  Road 

Princeton,  NJ  08544 

Internet: 

marvinb@pucc.princeton.edu

Biemesderfer,  Chris 
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